Register_backward_hook in quantized model

Firstly, unrelated to your question, the model above isn't going to work well for quantization. You have self.quant used at multiple points in the forward, which means that the quantization flow will only be able to assign a single set of quantization parameters that has to be shared across both places, drastically lowering accuracy.
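
For reference, here is a minimal sketch of the fix (the module and layer names are just placeholders, assuming the eager-mode QuantStub/DeQuantStub API): give each quantization point its own stub, so each one gets its own observer and therefore its own scale/zero_point during calibration.

```python
import torch
from torch.ao.quantization import QuantStub, DeQuantStub

class TwoBranchModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # one QuantStub per quantization point, so each point gets its own
        # observer and its own scale/zero_point after calibration
        self.quant1 = QuantStub()
        self.quant2 = QuantStub()
        self.linear1 = torch.nn.Linear(4, 4)
        self.linear2 = torch.nn.Linear(4, 4)
        self.dequant1 = DeQuantStub()
        self.dequant2 = DeQuantStub()

    def forward(self, x, y):
        x = self.dequant1(self.linear1(self.quant1(x)))
        y = self.dequant2(self.linear2(self.quant2(y)))
        return x + y
```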

As for your question, the issue is that there are no weight tensors (in the nn.Parameter sense) in quantized modules. Those tensors get packed into a special format that the quantized kernels can use more efficiently, so there's nothing for autograd to backprop through.
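
You can see this directly on a quantized module. A quick check (a sketch assuming torch.ao.nn.quantized.Linear; on older releases it lives under torch.nn.quantized): the module exposes no parameters at all, only packed params and a weight() accessor.

```python
import torch
import torch.ao.nn.quantized as nnq

# a quantized Linear keeps its weight inside a packed-params object,
# not as an nn.Parameter, so autograd (and backward hooks) never see it
qlinear = nnq.Linear(4, 4)

print(list(qlinear.named_parameters()))  # [] -- nothing registered for autograd
print(qlinear.weight().dtype)            # torch.qint8 -- a quantized tensor, returned by a method
print(type(qlinear._packed_params))      # the packed representation used by the quantized kernel
```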

Generally the way something like this is done is by using fake quants, i.e. modules that simulate quantized numerics while keeping fp32 dtypes. Once training (or whatever you need gradients for) is complete, the model would then be converted to the quantized model.
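
As a rough sketch of that flow (using the eager-mode QAT APIs prepare_qat/convert; the model and hook below are just placeholders): the prepared model still has fp32 weights wrapped in fake quant, so backward hooks fire and gradients flow as usual, and convert() swaps in the real quantized kernels afterwards.

```python
import torch
from torch.ao.quantization import (
    QuantStub, DeQuantStub, get_default_qat_qconfig, prepare_qat, convert,
)

class SmallModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()
        self.linear = torch.nn.Linear(4, 4)
        self.dequant = DeQuantStub()

    def forward(self, x):
        return self.dequant(self.linear(self.quant(x)))

model = SmallModel()
model.qconfig = get_default_qat_qconfig("fbgemm")
model.train()
prepared = prepare_qat(model)

# the weights are still fp32 here (fake quant only simulates int8 numerics),
# so backward hooks fire and gradients flow as usual
prepared.linear.register_full_backward_hook(
    lambda mod, grad_in, grad_out: print("got grad:", grad_out[0].shape)
)
prepared(torch.randn(2, 4)).sum().backward()

# once training is done, convert to the actual quantized model
prepared.eval()
quantized = convert(prepared)
```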

See Quantization — PyTorch 1.13 documentation for more info.