QAT with RNN -- RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

I’ve seen several posts about this runtime error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [16]] is at version 9; expected version 8 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

However, in my case this only arises when running QAT:

torch.quantization.prepare_qat(model, inplace=True)
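For context, here is a minimal sketch of the kind of QAT setup I mean (the model and qconfig here are illustrative, not my actual model):

```python
import torch

# Hypothetical stand-in model; my real model contains an RNN.
model = torch.nn.Sequential(torch.nn.Linear(4, 4))
model.train()

# Attach a default QAT qconfig, then swap in fake-quant-aware modules.
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
torch.quantization.prepare_qat(model, inplace=True)
```

After `prepare_qat`, the prepared modules carry FakeQuantize submodules whose observers update their buffers on every forward pass.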

After enabling anomaly detection (`torch.autograd.set_detect_anomaly(True)`), I see the following, which leads me to believe the quantization observers are being updated in place and subsequently blowing up during the backward call:

…/torch_1_7_0/lib/torch/quantization/fake_quantize.py", line 100, in forward
self.ch_axis, self.quant_min, self.quant_max)
(function _print_stack)
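The failure mode can be mimicked with a minimal autograd sketch, no quantization involved (the `scale` tensor here is my illustrative stand-in for an observer buffer):

```python
import torch

# A tensor saved for backward is mutated in place, the way a FakeQuantize
# observer mutates its scale/zero_point buffers on every forward pass.
w = torch.ones(3, requires_grad=True)
scale = torch.ones(3)          # stands in for an observer buffer

out = (w * scale).sum()        # `scale` is saved for the backward of mul
scale.mul_(2)                  # in-place update bumps its version counter

try:
    out.backward()
except RuntimeError as e:
    print(type(e).__name__)    # prints "RuntimeError"
```

Because the gradient of `w` needs `scale`, autograd saves `scale` during the forward pass; the in-place `mul_` changes its version counter, and backward raises the same "modified by an inplace operation" error. With an RNN, the same fake-quant module runs once per time step, so the buffers get updated several times before backward, which would explain the "is at version 9; expected version 8" mismatch.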

Any advice here? Thanks!

Hi, I have the same issue. Did you get it solved?
