A few questions about QConfig in quantization

Hello, I’m a beginner in quantization. When we want to quantize a model, we must specify a QConfig for the model so that scales and zero-points can be chosen. For example:

import torch
qconfig = torch.quantization.QConfig(
    activation=torch.quantization.default_observer,
    weight=torch.quantization.default_weight_observer)

I think the weight param of QConfig specifies the observer for the weight tensors. But what about activation? Does the activation observer watch a layer’s output values or its input? And is it used to choose the scales and zero-points for the inputs of layers?

For static quantization, the activations are quantized at the output; the input is assumed to already be quantized.

For dynamic quantization, the output is not quantized while the input is, so the qconfig gets applied to the input.
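
A minimal sketch of both flows, assuming a made-up one-layer model and random calibration data (neither comes from the original post):

    import torch
    import torch.nn as nn

    # Hypothetical model used only for illustration.
    float_model = nn.Sequential(nn.Linear(4, 4))

    # Dynamic quantization: weights are quantized ahead of time; the
    # activation qconfig is applied to the *input* at runtime.
    dynamic_model = torch.quantization.quantize_dynamic(
        float_model, {nn.Linear}, dtype=torch.qint8)

    # Static quantization: activation observers record ranges at the
    # *output* of each layer during a calibration pass.
    static_model = nn.Sequential(torch.quantization.QuantStub(),
                                 nn.Linear(4, 4),
                                 torch.quantization.DeQuantStub())
    static_model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
    prepared = torch.quantization.prepare(static_model)
    prepared(torch.randn(1, 4))   # calibration feeds the observers
    quantized = torch.quantization.convert(prepared)

Note the QuantStub/DeQuantStub pair in the static case: it quantizes the model input explicitly, which is why the layers themselves can assume their inputs are already quantized.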


Thank you. I am confused about the term activation: here it means the output of an arbitrary layer (conv2d, linear, …), right? Or must it be the output of an activation function?

Generally we use the term activation to mean any tensor flowing from one layer to another, which is known only at runtime, whereas weights are known before runtime.
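
A small sketch of that distinction, assuming a made-up model (the attribute name activation_post_process is what eager-mode prepare attaches in PyTorch; the rest is illustrative):

    import torch
    import torch.nn as nn

    # Hypothetical model for illustration.
    model = nn.Sequential(nn.Linear(4, 4), nn.ReLU())
    model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
    prepared = torch.quantization.prepare(model)

    # Weights are known before runtime, so their ranges could be observed
    # immediately; activation ranges only exist once data flows through.
    prepared(torch.randn(8, 4))  # runtime data fills the activation observers

    # Each observed module now carries min/max statistics seen at runtime.
    print(prepared[0].activation_post_process.min_val,
          prepared[0].activation_post_process.max_val)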
