Hello all.
I have started learning about quantization, trying "dynamic quantization" first.
While going through several tutorials (like this one or this one), I noticed that torch.nn.Linear
is passed in the qconfig_spec argument of quantize_dynamic in this example from the official tutorials:
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
The tutorial says: "We specify that we want the torch.nn.Linear modules in our model to be quantized". But to me it seems obvious that you would want them converted to int8; presumably you want everything that can be converted to int8 to be converted.
That said, I would like to know whether it is actually necessary to specify the Linear module explicitly.
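For concreteness, here is a minimal sketch of what I tried (the TinyModel class and its layer names are just placeholders I made up to check which modules get swapped):

    import torch
    import torch.nn as nn

    # Toy model (hypothetical) with one Linear layer and one ReLU.
    class TinyModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(16, 8)
            self.relu = nn.ReLU()

        def forward(self, x):
            return self.relu(self.fc(x))

    model = TinyModel()

    # Explicitly restrict dynamic quantization to nn.Linear modules.
    quantized_model = torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

    # Inspect which modules were swapped: the Linear is replaced by a
    # dynamically quantized variant, while the ReLU is left untouched.
    print(quantized_model.fc)    # DynamicQuantizedLinear(...)
    print(quantized_model.relu)  # still nn.ReLU

So is the explicit {torch.nn.Linear} just documentation of the default behavior, or does leaving it out change which modules get quantized?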