RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 512, 4, 4]] is at version 3; expected version 2 instead

Walter_sh · May 20, 2021, 2:21am

Hello!
I’m training my GAN code (pix2pixHD code) and I get an error about an in-place operation.
How can I correct this problem?
Thanks!

eqy · May 20, 2021, 3:35am

What is the output when you set
torch.autograd.set_detect_anomaly(True)?

Walter_sh · May 20, 2021, 4:33am

I haven’t set torch.autograd.set_detect_anomaly(True).
Where should I put that line?

eqy · May 20, 2021, 4:37am

I think you can set it any time before you start model training.
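For reference, a minimal sketch of where the call could go. The tiny computation below is only a stand-in chosen to trigger the same class of error; it is not the pix2pixHD model, and the variable names are made up for illustration. With anomaly detection on, PyTorch also prints the traceback of the forward operation whose backward failed.

```python
import torch

# Enable anomaly detection once, before the training loop starts.
# It slows training down, so it is meant for debugging only.
torch.autograd.set_detect_anomaly(True)

# Stand-in computation (hypothetical, not the real model).
w = torch.ones(3, requires_grad=True)
y = w.exp()      # exp keeps its output around for the backward pass
y.add_(1)        # in-place change to that saved output

try:
    y.sum().backward()
    error_message = None
except RuntimeError as err:
    error_message = str(err)

torch.autograd.set_detect_anomaly(False)  # switch it off again after debugging
print(error_message)
```

In a real script the first line would sit near the top of the training file, and the try/except would not be needed.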

Walter_sh · May 20, 2021, 4:43am

Thanks for the advice!
I set the detect_anomaly code at the start of the training part, and here is the error.

eqy · May 20, 2021, 6:53am

Ok, it looks like the issue is in a conv layer of the model. Can you show the model definition?

Walter_sh · May 20, 2021, 7:10am

Here is the layer code I implemented.

albanD · May 20, 2021, 12:55pm

Hi,

My best guess here would be that you compute all the forward passes and losses first, and then do a series of backward/step calls for each loss, one after the other?
The problem is that the step function modifies the parameters of your net in place, but these values are needed to run the backward pass (by conv, for example).
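A small sketch of what "modifies the parameters in place" means in practice, assuming a hypothetical one-layer net in place of the real one. It reads the tensor's `_version` attribute, an internal counter that I believe is what the "is at version 3; expected version 2" part of the error message refers to.

```python
import torch

# Hypothetical one-layer stand-in for the real network.
net = torch.nn.Linear(4, 2)
opt = torch.optim.SGD(net.parameters(), lr=0.1)

loss = net(torch.randn(3, 4)).sum()
loss.backward()

version_before = net.weight._version  # counter that autograd checks during backward
opt.step()                            # updates net.weight in place
version_after = net.weight._version

print(version_before, version_after)
```

Any graph that was built before the step and still needs `net.weight` will see the changed counter and refuse to run its backward.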

You want to make sure that you don’t do a step operation between the forward and the backward.
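To make this concrete, here is a hedged sketch of the failing ordering next to one possible fix. Everything in it is an assumption for illustration: the one-layer `G` and `D`, the toy losses, and the function names are not taken from pix2pixHD.

```python
import torch
from torch import nn

torch.manual_seed(0)

def make_models():
    # Hypothetical one-layer generator and discriminator, for illustration only.
    G, D = nn.Linear(8, 4), nn.Linear(4, 1)
    opt_G = torch.optim.SGD(G.parameters(), lr=0.1)
    opt_D = torch.optim.SGD(D.parameters(), lr=0.1)
    return G, D, opt_G, opt_D

def broken_iteration():
    """Both losses are built first, then backward/step run one after the other."""
    G, D, opt_G, opt_D = make_models()
    z, real = torch.randn(2, 8), torch.randn(2, 4)
    d_fake = D(G(z))
    loss_D = d_fake.mean() - D(real).mean()
    loss_G = -d_fake.mean()

    opt_D.zero_grad()
    loss_D.backward(retain_graph=True)
    opt_D.step()                 # D's weights change in place here
    try:
        opt_G.zero_grad()
        loss_G.backward()        # still needs D's weights from the forward pass
    except RuntimeError as err:
        return str(err)
    return None

def fixed_iteration():
    """Each loss gets its own forward pass right before its backward and step."""
    G, D, opt_G, opt_D = make_models()
    z, real = torch.randn(2, 8), torch.randn(2, 4)
    fake = G(z)

    # Discriminator update: detach so this graph does not reach into G.
    loss_D = D(fake.detach()).mean() - D(real).mean()
    opt_D.zero_grad()
    loss_D.backward()
    opt_D.step()

    # Generator update: fresh forward through the already-updated D.
    loss_G = -D(fake).mean()
    opt_G.zero_grad()
    loss_G.backward()
    opt_G.step()
    return loss_G.item()

broken_error = broken_iteration()
fixed_loss = fixed_iteration()
print(broken_error)
print(fixed_loss)
```

Another option that follows the same rule is to call backward on every loss first and only then call step on the optimizers.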