I am trying to run some code in PyTorch, but I got stuck at this point:
At the first iteration, both backward operations, for the Discriminator and for the Generator, run fine:
....
self.G_loss.backward(retain_graph=True)
self.D_loss.backward()
...
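For context, the surrounding training step is ordered roughly like the sketch below. The models, losses, and optimizers are placeholders rather than my actual code; this minimal version only shows the call ordering and does not by itself reproduce the error.

    import torch
    import torch.nn as nn

    # Placeholder stand-ins for the real Generator/Discriminator
    G = nn.Linear(16, 32)
    D = nn.Linear(32, 1)
    g_optim = torch.optim.Adam(G.parameters(), lr=1e-4)
    d_optim = torch.optim.Adam(D.parameters(), lr=1e-4)

    for step in range(2):          # two training iterations
        real = torch.randn(8, 32)
        noise = torch.randn(8, 16)

        fake = G(noise)
        d_fake = D(fake)
        d_real = D(real)

        G_loss = -d_fake.mean()                 # placeholder generator loss
        D_loss = d_fake.mean() - d_real.mean()  # placeholder discriminator loss

        g_optim.zero_grad()
        d_optim.zero_grad()
        # Same ordering as above: generator backward keeps the graph alive,
        # then the discriminator backward reuses it, then both optimizers step.
        G_loss.backward(retain_graph=True)
        D_loss.backward()
        g_optim.step()
        d_optim.step()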
At the second iteration, when self.G_loss.backward(retain_graph=True) executes, I get this error:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [8192, 512]] is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
According to torch.autograd.set_detect_anomaly, the last of the following lines in the Discriminator network is responsible for this:
bottleneck = bottleneck[:-1]
self.embedding = x.view(x.size(0), -1)
self.logit = self.layers[-1](self.embedding)
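For context, those three lines sit at the end of the discriminator's forward pass, roughly like the reconstruction below. The layer types and sizes are assumed for illustration and are not the exact architecture.

    import torch
    import torch.nn as nn

    class Discriminator(nn.Module):
        def __init__(self):
            super().__init__()
            self.layers = nn.ModuleList([
                nn.Linear(512, 512),
                nn.Linear(512, 512),
                nn.Linear(512, 1),   # final layer producing the logit
            ])

        def forward(self, x):
            bottleneck = []
            for layer in self.layers[:-1]:
                x = torch.relu(layer(x))
                bottleneck.append(x)        # collect intermediate activations
            bottleneck = bottleneck[:-1]    # drop the last intermediate activation
            self.embedding = x.view(x.size(0), -1)   # flatten; stored on the module
            self.logit = self.layers[-1](self.embedding)
            return self.logit, bottleneck

    # Example usage with an assumed input size
    D = Discriminator()
    logit, features = D(torch.randn(8, 512))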
The strange thing is that I have used this network architecture in other code and it worked properly. Any suggestions?
The full error:
site-packages\torch\autograd\__init__.py", line 127, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [8192, 512]] is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!