What do [De]QuantStub actually do?

I have scoured all the documentation I could locate and am still confused on a few points:

Quantize stub module, before calibration, this is same as an observer, it will be swapped as nnq.Quantize in convert.

which unfortunately isn’t very helpful (which “observer”?). The last part of that sentence seems to suggest that at inference time this module will be replaced with one that does the actual float-to-int8 data conversion. But what does this module do at calibration time?

Some more details would be very much appreciated.


QuantStub is just a placeholder for the quantize op; it needs to be unique since it has state.
DeQuantStub is a placeholder for the dequantize op; it does not need to be unique since it’s stateless.

In eager mode quantization, users need to manually place QuantStub and DeQuantStub in the model wherever an activation crosses the quantized/non-quantized boundary.
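To make the placement concrete, here is a minimal sketch of manual stub placement in eager mode (module names like `M`, `conv`, and `relu` are illustrative, not from the thread; `torch.quantization` is `torch.ao.quantization` in newer releases):

```python
import torch
import torch.nn as nn

class M(nn.Module):
    def __init__(self):
        super().__init__()
        # float -> quantized boundary: activations enter the quantized region here
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()
        # quantized -> float boundary: activations leave the quantized region here
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.conv(x)
        x = self.relu(x)
        return self.dequant(x)
```

Everything between `self.quant` and `self.dequant` in `forward` is what will run in int8 after convert; code outside that span stays in float.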

One thing to remember: for a quantized module, we always quantize the output of the module, but we don’t quantize its input. The quantization of the input Tensor is expected to be taken care of by the previous module. That’s why we have QuantStub here: it quantizes the input for the next quantized module in the sequence.

So in prepare, we’ll attach an observer to the output of QuantStub to record the statistics of the output Tensor, just like for other modules such as nn.Conv2d. The observer is specified by the qconfig.

And in convert, QuantStub will be swapped with an nnq.Quantize module, so the output of nnq.Quantize (which is the input of nnq.Conv2d) will be quantized.
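The prepare/calibrate/convert flow described above can be sketched end to end as follows (a minimal sketch assuming the eager-mode post-training static quantization API and the `fbgemm` backend; the model `M` is illustrative):

```python
import torch
import torch.nn as nn

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, 3)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.conv(self.quant(x)))

m = M().eval()
m.qconfig = torch.quantization.get_default_qconfig("fbgemm")

# prepare: observers are attached to the outputs of QuantStub and Conv2d
torch.quantization.prepare(m, inplace=True)

# calibration: a forward pass lets the observers record tensor statistics
m(torch.randn(1, 3, 8, 8))

# convert: QuantStub -> nnq.Quantize, nn.Conv2d -> nnq.Conv2d
torch.quantization.convert(m, inplace=True)
```

After `convert`, `m.quant` is an `nnq.Quantize` module that performs the actual float-to-int8 conversion, using the scale/zero-point its observer recorded during calibration.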


Thank you, Jerry, this helps.

To clarify

In eager mode quantization

– that is the only mode available at the moment, correct? The JIT does not quantize (yet?).

This might sound like a nit, but I think the following actually reflects a fundamental difficulty of quantizing in eager mode:

so the quantization of the input Tensor should be taken care of by the previous module,

How does PyTorch decide what is “previous”? The true sequence of layer invocations is determined by the procedural code in forward(), and it can involve branching at runtime or data-flow merging, as with skip connections.

(I have yet to succeed in quantizing ResNet because of this, I suspect.)


– that is the only mode available at the moment, correct? The JIT does not quantize (yet?).

Yeah, eager mode is the only mode supported in the public release, but graph mode is coming in 1.6 as well.

How does pytorch decide what is “previous”?

PyTorch doesn’t do this in eager mode; that’s why users need to manually place QuantStub and DeQuantStub themselves. This is done automatically in graph mode quantization.
Eager mode just swaps all modules that have a qconfig, so users need to make sure the swap makes sense, set the qconfig, and place QuantStub/DeQuantStub correctly.

When I do QAT (quantization-aware training), the tutorials say I should put QuantStub and DeQuantStub around my submodules within forward. So what do they do at that time? Are they only meaningful at inference and validation time?

Yeah, in training they are no-ops, but after convert we replace them with quantize/dequantize ops.
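A quick way to see the no-op behavior: before prepare/convert, QuantStub’s forward simply returns its input unchanged (a minimal check using the eager-mode API):

```python
import torch

# Before prepare/convert, QuantStub is the identity function on its input.
stub = torch.quantization.QuantStub()
x = torch.randn(4)
same = torch.equal(stub(x), x)  # the stub passed x through untouched
```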


Thank you very much.
So they only take effect during module convert, and do nothing during QAT training, even after model.eval()? When I change the quant config and then validate the model, the mAP is different.

My purpose is to quantize only part of the model during QAT, but when I execute prepare_qat, I see that FakeQuantize is inserted into all of the modules; QuantStub and DeQuantStub do not do anything. Is there any way to insert FakeQuantize into only part of the model?

Yeah, you’ll need to configure the qconfig correctly.
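One way to configure this, sketched below under the assumption that you want to exclude a specific child module: prepare_qat only inserts FakeQuantize into submodules whose qconfig is set, so setting a child’s qconfig to None leaves it in float (the model and module names here are illustrative):

```python
import torch
import torch.nn as nn

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv1 = nn.Conv2d(3, 8, 3)    # to be quantized
        self.dequant = torch.quantization.DeQuantStub()
        self.conv2 = nn.Conv2d(8, 8, 3)    # to stay in float

    def forward(self, x):
        x = self.dequant(self.conv1(self.quant(x)))
        return self.conv2(x)

m = M().train()
m.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
m.conv2.qconfig = None                     # exclude conv2 from quantization

# prepare_qat swaps conv1 for a QAT module with FakeQuantize attached;
# conv2 is skipped because its qconfig is None.
torch.quantization.prepare_qat(m, inplace=True)
```

After prepare_qat, `m.conv1` has a `weight_fake_quant` attribute while `m.conv2` remains a plain `nn.Conv2d`.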

Eager mode is hard to use in general. If your model works with torch.export, it might be easier to try our new flows: (prototype) PyTorch 2.0 Export Post Training Static Quantization — PyTorch Tutorials 2.0.1+cu117 documentation and How to Write a Quantizer for PyTorch 2 Export Quantization — PyTorch Tutorials 2.1.1+cu121 documentation