Help RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

I ran into a problem when calculating the loss function.

Traceback (most recent call last):
  File "train.py", line 122, in <module>
    g_loss.backward()
  File "/nas/longleaf/home/lehuang/tool/anaconda/envs/deephic/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/nas/longleaf/home/lehuang/tool/anaconda/envs/deephic/lib/python3.7/site-packages/torch/autograd/__init__.py", line 132, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 256, 1, 1]] is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

As given in the error message, a variable of size [1, 256, 1, 1] seems to have been modified in-place. If you post some code, we might be able to help.
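
Since the traceback comes from g_loss.backward(), this looks like a GAN-style training loop. One common cause in that setting (just a guess, since no code was posted) is calling an optimizer's step() between a forward pass and the matching backward(): step() updates the parameters in-place, which invalidates the values autograd saved during that forward pass. A minimal sketch of the pattern, using hypothetical names net and opt:

import torch

# Hypothetical setup, only to illustrate the pitfall; not the original poster's code.
net = torch.nn.Linear(4, 1)
opt = torch.optim.SGD(net.parameters(), lr=0.1)
x = torch.randn(8, 4, requires_grad=True)

loss1 = net(x).mean()
loss1.backward()       # populates gradients for the optimizer step
loss2 = net(x).mean()  # this forward pass saves net.weight for its backward
opt.step()             # in-place parameter update: the saved weight's version bumps
loss2.backward()       # RuntimeError: ... modified by an inplace operation

If this is what is happening, recomputing the forward pass after the optimizer step (or reordering the loop so backward() runs before step()) avoids the error.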

Just to give you an example:

import torch

x = torch.randn(5, 6, requires_grad=True)
z = x.sum(-1)        # z is a non-leaf tensor tracked by autograd
z += z * z           # in-place update: z's version counter is bumped
z.sum().backward()   # raises the same RuntimeError

In the sample code above, z += z * z uses z to perform an operation and, at the same time, modifies z in place. When the gradient is computed, the old value of z that autograd needs has already been overwritten, which results in the error.
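
One way to fix it is to use the out-of-place version of the operation, so the original value of z survives for the backward pass:

import torch

x = torch.randn(5, 6, requires_grad=True)
z = x.sum(-1)
z = z + z * z        # out-of-place: builds a new tensor, the old z is preserved
z.sum().backward()   # succeeds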

You can enable PyTorch's anomaly detection as shown below, which may help you locate the line that causes the error:

import torch

with torch.autograd.set_detect_anomaly(True):
    x = torch.randn(5, 6, requires_grad=True)
    z = x.sum(-1)
    z += z * z           # the offending in-place op
    z.sum().backward()   # the traceback now also points to the forward-pass line above
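
The version numbers in your error message ("is at version 2; expected version 1") refer to a per-tensor counter that autograd uses to detect in-place modifications. If you want to watch it directly, tensors expose it through the internal _version attribute (an implementation detail, so don't rely on it in production code):

import torch

x = torch.randn(5, 6, requires_grad=True)
z = x.sum(-1)
print(z._version)    # 0: freshly created
z += z * z
print(z._version)    # 1: bumped by the in-place add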