What does "histogram of activations" mean in calibration for INT8 in TensorRT?

I am going through http://on-demand.gputechconf.com/gtc/2017/presentation/s7310-8-bit-inference-with-tensorrt.pdf and it says (slide 18) that while calibrating to find the optimal threshold using KL divergence, it collects a "histogram of activations". What exactly does "activation" refer to? Also, the graph (on the left) has "Normalized Number of Counts" on the y-axis, which is also unclear to me. Any reference or brief idea would be great!

Thanks!

Dear agupta2,

it collects a "histogram of activations". What exactly does "activation" refer to?

Activation = the output produced by any layer of the network; in other words, the result of a forward pass. For inference, layers typically operate like this:

output_activation = layer(constant_weights, input_activation_from_previous_layer)

and in TRT we collect histograms of the absolute values of the output activations from all layers; the histograms are collected from FP32 runs.
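
Here is a minimal sketch (not TensorRT's internal implementation) of what that collection looks like for a single layer, in NumPy. run_fp32_layer and calibration_batches are hypothetical placeholders:

import numpy as np

NUM_BINS = 2048  # the slides describe 2048-bin histograms

# Pass 1 (assumed): find the maximum absolute activation value,
# which fixes the histogram range [0, abs_max].
abs_max = 0.0
for batch in calibration_batches:
    act = run_fp32_layer(batch)   # FP32 output activation tensor of the layer
    abs_max = max(abs_max, float(np.abs(act).max()))

# Pass 2: accumulate counts of |activation| into fixed-width bins.
hist = np.zeros(NUM_BINS, dtype=np.int64)
for batch in calibration_batches:
    act = run_fp32_layer(batch)
    counts, _ = np.histogram(np.abs(act), bins=NUM_BINS, range=(0.0, abs_max))
    hist += counts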

"Normalized Number of Counts" on the y-axis, which is also unclear to me

Histograms contain "counts" (integers); for the plots they are normalized, so:
normalized_count = count / sum(all_counts)

All normalized counts therefore sum to 1; for example, the value "0.3" means that 30% of the activations for that layer fell into that histogram bucket.
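
As a toy worked example (numbers invented for illustration):

import numpy as np

counts = np.array([5, 120, 60, 15])  # raw histogram counts (toy numbers)
normalized = counts / counts.sum()   # [0.025, 0.6, 0.3, 0.075]
# normalized.sum() == 1.0; the 0.3 bucket held 30% of the activations.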

Makes sense now, thanks a lot!

Thanks SteveNV!

Hi SteveNV,

can you be more specific when you say:

we are collecting histograms of absolute values of output activations

in that slide (page 18) the histogram of vgg19:conv3_4 is shown. I really don't get where those points come from; that layer has multiple channels (256 if I'm correct), so which output values are you considering?

EDIT:
do you also have to scale back the quantized distribution in order to compare it with the original one?