A few questions about QConfig in quantization

Hello, I’m a beginner in quantization. When we want to quantize a model, we must specify a QConfig for the model so that scales and zero-points can be chosen. For example:

import torch
qconfig = torch.quantization.QConfig(
    activation=torch.quantization.default_observer,
    weight=torch.quantization.default_weight_observer)

I think the weight param of QConfig specifies the observer for the weight tensors. But what about activation? Does the activation observer watch a layer’s output values or its input? And is it used to choose the scales and zero-points for the inputs of layers?

For static quantization, the activations are quantized at the output; the input is assumed to already be quantized.

For dynamic quantization, the output is not quantized while the input is, so the qconfig gets applied to the input.
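
A minimal sketch of both flows, assuming a made-up one-layer model and random calibration data (neither comes from the original post):

    import torch
    import torch.nn as nn

    # Hypothetical model used only for illustration.
    float_model = nn.Sequential(nn.Linear(4, 4))

    # Dynamic quantization: weights are quantized ahead of time; the
    # activation qconfig is applied to the *input* at runtime.
    dynamic_model = torch.quantization.quantize_dynamic(
        float_model, {nn.Linear}, dtype=torch.qint8)

    # Static quantization: activation observers record ranges at the
    # *output* of each layer during a calibration pass.
    static_model = nn.Sequential(torch.quantization.QuantStub(),
                                 nn.Linear(4, 4),
                                 torch.quantization.DeQuantStub())
    static_model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
    prepared = torch.quantization.prepare(static_model)
    prepared(torch.randn(1, 4))   # calibration feeds the observers
    quantized = torch.quantization.convert(prepared)

Note the QuantStub/DeQuantStub pair in the static case: it quantizes the model input explicitly, which is why the layers themselves can assume their inputs are already quantized.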


Thank you. I am confused about the term activation: here it means the output of an arbitrary layer (conv2d, linear, …), right? Or must it be the output of an activation function?

Generally we use the term activation to mean any tensor flowing from one layer to another, which is known only at runtime, whereas weights are known before runtime.
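
A small sketch of that distinction, assuming a made-up model (the attribute name activation_post_process is what eager-mode prepare attaches in PyTorch; the rest is illustrative):

    import torch
    import torch.nn as nn

    # Hypothetical model for illustration.
    model = nn.Sequential(nn.Linear(4, 4), nn.ReLU())
    model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
    prepared = torch.quantization.prepare(model)

    # Weights are known before runtime, so their ranges could be observed
    # immediately; activation ranges only exist once data flows through.
    prepared(torch.randn(8, 4))  # runtime data fills the activation observers

    # Each observed module now carries min/max statistics seen at runtime.
    print(prepared[0].activation_post_process.min_val,
          prepared[0].activation_post_process.max_val)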
