I want to change the gradient of my quantized model with a straight-through estimator (STE) so that it supports backpropagation, but it fails. Can someone tell me how to use the register_backward_hook() function, or some other method, to achieve this on a quantized model?
Firstly, unrelated to your question: the model above isn't going to work well for quantization. You use self.quant at multiple points in the forward, which means the quantization flow can only assign a single set of quantization parameters that then has to serve two places, drastically lowering accuracy.
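To illustrate the fix (the original model isn't shown, so this is a hypothetical sketch): give each quantization point its own QuantStub so the flow can assign it its own scale/zero_point during calibration.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import QuantStub, DeQuantStub

class TwoInputModel(nn.Module):
    """Hypothetical example: one QuantStub per quantization point,
    rather than reusing a single self.quant in two places."""
    def __init__(self):
        super().__init__()
        self.quant_a = QuantStub()  # gets its own observer / qparams
        self.quant_b = QuantStub()  # separate observer / qparams
        self.fc1 = nn.Linear(4, 4)
        self.fc2 = nn.Linear(4, 4)
        self.dequant = DeQuantStub()

    def forward(self, a, b):
        a = self.quant_a(a)
        b = self.quant_b(b)
        return self.dequant(self.fc1(a)), self.dequant(self.fc2(b))

m = TwoInputModel()
out_a, out_b = m(torch.randn(2, 4), torch.randn(2, 4))
```

Before convert, the stubs are pass-throughs; after prepare/convert, each stub becomes an independent quantize op with its own parameters.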
As for your question, the issue is that there are no weight tensors in quantized modules. Those tensors get packed into a special format that the quantized kernel can use more effectively, so there's nothing to run backprop on.
Generally the way something like this is done is with fake quants, i.e. modules that simulate quantized numerics in fp32 dtypes. Once training (or whatever you're doing) is complete, the model is then converted to the actual quantized model.
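A minimal sketch of what fake quant + STE looks like (my own example, not the library's FakeQuantize module): quantize-dequantize in fp32 on the forward pass, and pass the gradient straight through on the backward pass.

```python
import torch

class FakeQuantSTE(torch.autograd.Function):
    """Simulate int8 quantization in fp32; straight-through gradient."""
    @staticmethod
    def forward(ctx, x, scale, zero_point):
        # quantize-dequantize: the output takes only representable values
        q = torch.clamp(torch.round(x / scale) + zero_point, -128, 127)
        return (q - zero_point) * scale

    @staticmethod
    def backward(ctx, grad_out):
        # STE: pretend round() was the identity; no grad for scale/zp
        return grad_out, None, None

x = torch.randn(3, requires_grad=True)
y = FakeQuantSTE.apply(x, 0.1, 0)
y.sum().backward()
print(x.grad)  # all ones: the gradient passed straight through
```

Since everything stays fp32, normal autograd works, and the numerics still match what the quantized model will do after convert.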
Thank you for your reply! However, my requirement is that the quantized model itself supports backpropagation, so fake quants won't work. How can I get the weight tensors from the quantized modules?
No, because in the final quantized model the weights are packed so that they're ready for production use, so they're not in a format you can work with directly. It's like asking "how can I edit a text file after I've compressed it into a zip?" You can't, unless you want to repeatedly decompress it, make an edit, then recompress it. We don't have any support for something like that, since it'd be faster to just make the edits first and then compress afterwards.
You'd use the command above, which unpacks the quantized tensor.
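The exact command isn't quoted here, but for a quantized Linear module the packed params can be read back via its `weight()` accessor. A small example using dynamic quantization (module names like `qlin` are mine):

```python
import torch
import torch.nn as nn

# Quantize a toy model; nn.Linear becomes a dynamically quantized Linear
# whose weight lives inside a packed-params object, not a plain tensor.
m = nn.Sequential(nn.Linear(4, 4))
qm = torch.ao.quantization.quantize_dynamic(m, {nn.Linear}, dtype=torch.qint8)

qlin = qm[0]
w = qlin.weight()          # unpacks and returns the quantized weight tensor
print(w.dtype)             # torch.qint8
print(w.dequantize().shape)
```

Note this gives you a read-out of the weight; it's not a leaf tensor you can attach gradients to, which is why the custom-op approach below the thread is suggested.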
Realistically, you'd be better off writing a custom quantized linear/conv op that stores the quantized tensor normally, then packs it and calls the regular kernel when forward is run. You could then write the autograd function for that op.
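A rough sketch of that idea, assuming a quantized CPU backend (fbgemm/qnnpack) is available; all names here are mine, and a real version would keep an fp32 master copy of the weight since quantized tensors can't carry gradients:

```python
import torch

class QuantLinearSTE(torch.autograd.Function):
    """Sketch: keep the int8 weight as an ordinary quantized tensor,
    pack it at forward time for the quantized kernel, and define the
    backward pass by hand (straight-through w.r.t. quantization)."""
    @staticmethod
    def forward(ctx, x, qweight, bias):
        ctx.save_for_backward(x, qweight.dequantize())
        packed = torch.ops.quantized.linear_prepack(qweight, bias)
        return torch.ops.quantized.linear_dynamic(x, packed)

    @staticmethod
    def backward(ctx, grad_out):
        x, w = ctx.saved_tensors
        grad_x = grad_out @ w                        # dL/dx
        grad_w = grad_out.transpose(-2, -1) @ x      # dL/dW via STE
        grad_b = grad_out.sum(dim=0)                 # dL/db
        # grad_w / grad_b would be applied to fp32 master copies in practice
        return grad_x, grad_w, grad_b

x = torch.randn(2, 4, requires_grad=True)
w = torch.randn(3, 4)
qw = torch.quantize_per_tensor(w, scale=0.05, zero_point=0, dtype=torch.qint8)
y = QuantLinearSTE.apply(x, qw, torch.zeros(3))
y.sum().backward()
```

The forward uses the real quantized kernel, so production numerics are preserved, while the hand-written backward supplies the gradients the packed format can't.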