 # Can we calculate gradients for quantized models?

I tried to use `torch.autograd.grad()` to compute gradients for a quantized model, just as we usually do with full-precision models:

```python
for idx, (inputs, targets) in enumerate(data_loader):
    outputs = quantized_model(inputs)
    loss = criterion(outputs, targets)
    grads = torch.autograd.grad(loss, quantized_model.parameters())
```

But I got a `RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn`.
Do models quantized with PyTorch quantization currently not support backpropagation? Is there some way I can calculate gradients for PyTorch quantized models?

Quantized models currently run only during inference, so you can only call `forward` on them. If you are trying out quantization-aware training (Quantization Recipe — PyTorch Tutorials 1.9.1+cu102 documentation), we do support backpropagation in that case during training.

Thank you for the reply. I know that quantization-aware training uses fake quantization during training, which simulates quantization with fp32. I want to know: what is the difference between fake quantization and real quantization, especially when we do backpropagation on them?

Fake quantization simulates quantization but uses high-precision data types.

So, for example, imagine you were trying to quantize to integers.

Mathematically, a quantized linear op would be:

```python
X = round(X).to(int)
weight = round(weight).to(int)
out = X * weight
```

whereas a fake-quantized linear would be:

```python
X = round(X).to(fp32)
weight = round(weight).to(fp32)
out = X * weight
```
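To make the contrast concrete, here is a minimal pure-Python sketch of the same two linears (no PyTorch; the `scale` parameter and the toy values are made up for illustration):

```python
def quantized_linear(x, weight, scale=0.1):
    # real quantization: values become integers, arithmetic is integer
    xq = int(round(x / scale))
    wq = int(round(weight / scale))
    return xq * wq  # an integer result, in units of scale**2

def fake_quantized_linear(x, weight, scale=0.1):
    # fake quantization: snap to the same integer grid, but stay in float
    xq = round(x / scale) * scale
    wq = round(weight / scale) * scale
    return xq * wq  # a float result, representing the same grid points

x, w = 0.34, 0.27
print(quantized_linear(x, w))       # integer math
print(fake_quantized_linear(x, w))  # float math on the same grid
```

Both versions lose the same information to rounding; the difference is only the dtype the result lives in, which is what makes the fake-quantized version compatible with float-based training code.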

In practice, quantized weights are stored as quantized tensors, which are laid out so that quantized operations run quickly, and are consequently difficult to interact with directly.

fake_quantized weights are stored as floats so you can interact with them easily in order to do gradient updates.
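Because the stored weight stays a float, a training step can update it directly. A minimal sketch of such an update loop, with a hand-derived gradient for a one-input squared-error loss and treating the rounding as identity in the backward pass (all names, values, and the learning rate are illustrative):

```python
def fake_quant(v, scale=0.1):
    # snap to the quantization grid, but keep a float
    return round(v / scale) * scale

weight = 0.27            # float "master" weight that gradient updates touch
x, target, lr = 2.0, 1.0, 0.05

for _ in range(20):
    wq = fake_quant(weight)        # forward uses the fake-quantized value
    out = x * wq
    loss = (out - target) ** 2
    # d(loss)/d(weight), pretending fake_quant is the identity;
    # the update is applied to the float master weight
    grad = 2.0 * (out - target) * x
    weight -= lr * grad

print(weight, fake_quant(weight))
```

The float master weight drifts smoothly while its fake-quantized version jumps between grid points; the loop settles once the quantized value hits the grid point closest to the optimum.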

Most quantized ops do not have a gradient function, so you won't be able to take gradients through them. Note: even quantization-aware training doesn't really give the true gradients of the model, since it relies on the straight-through estimator, see: [1308.3432] Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation
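The trick from that paper can be sketched in two lines of plain Python: the forward pass applies the non-differentiable rounding, while the backward pass pretends rounding is the identity (the function names here are illustrative, not a PyTorch API):

```python
def round_forward(x):
    # forward pass: the non-differentiable rounding step
    return float(round(x))

def round_backward_ste(upstream_grad):
    # straight-through estimator: the true derivative of round() is 0
    # almost everywhere (and undefined at the steps), so we pretend it
    # is the identity and pass the upstream gradient through unchanged
    return upstream_grad

y = round_forward(1.7)         # 2.0 in the forward pass
g = round_backward_ste(0.25)   # backward: gradient flows as if y == x
```

In an autograd framework the same effect is often written as `x + (round(x) - x).detach()`-style expressions, so the rounded value is used forward while gradients bypass the rounding.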

Thank you so much for your clear explanation! :)