RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [], which is output 0 of PermuteBackward, is at version 3; expected version 0 instead

Hi Zhouker!

This line of code calling self.qkv().reshape().permute() doesn’t match the call to
self.qkv() quoted in the forward-call traceback produced by detect_anomaly().

Are you sure that you’re looking at the right sections or versions of your code here?

This looks like a problem. As I understand it, x (at this point in the code) is a tensor in
the computation graph.

Note that writing into a tensor with indexing is an inplace modification, so it looks like
x[i] = ... is introducing an inplace modification into the computation graph.
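As a minimal sketch of what I mean (the tensors w, x, and y and both function names are made up for illustration and are not taken from your model), here is an indexed write into a tensor that autograd has saved for the backward pass, next to an out-of-place alternative that leaves the saved tensor alone:

```python
import torch

def inplace_index_breaks_backward():
    # Hypothetical toy graph, not the poster's model.
    w = torch.ones(3, requires_grad=True)
    x = w * 2.0              # x is a non-leaf tensor in the computation graph
    y = (x ** 2).sum()       # pow saves x, because backward needs it
    x[0] = 0.0               # indexed write: an inplace modification of x
    try:
        y.backward()
    except RuntimeError as err:
        return str(err)      # the "modified by an inplace operation" error
    return None

def out_of_place_alternative():
    w = torch.ones(3, requires_grad=True)
    x = w * 2.0
    y = (x ** 2).sum()
    mask = torch.tensor([True, False, False])
    # torch.where builds a new tensor; the tensor saved by pow is untouched.
    x = torch.where(mask, torch.zeros_like(x), x)
    y.backward()
    return w.grad

if __name__ == "__main__":
    print(inplace_index_breaks_backward())
    print(out_of_place_alternative())
```

Whether an out-of-place rewrite like this fits your use case depends on what the x[i] = ... loop is actually computing.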

What is the shape of x at this point as confirmed by printing out x.shape? Does
it match the shape of the modified tensor reported in the original error message?

What are your print statements showing for x._version? Is it changing, indicating an
inplace modification? Do the values of x._version line up with those in the original
error message?
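Here is a rough sketch of the kind of check I have in mind, again using a made-up tensor x rather than your actual code. It records x._version before and after each indexed write so you can see the counter move:

```python
import torch

def version_trace(num_writes=3):
    # Hypothetical stand-in for the tensor being written to in your loop.
    w = torch.ones(4, requires_grad=True)
    x = w * 2.0
    versions = [x._version]          # version of the freshly created tensor
    for i in range(num_writes):
        x[i] = 0.0                   # indexed write, modifies x in place
        versions.append(x._version)  # the counter goes up after each write
    return x.shape, versions

if __name__ == "__main__":
    shape, versions = version_trace()
    print("shape:", shape)
    print("versions:", versions)
```

If the final value you see in your own code matches the "is at version ..." number in the error message, that is good evidence you have found the offending tensor.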

You can find a discussion of how to debug inplace-modification errors in this post.

Inplace-modification errors can often be “fixed” (that is, swept under the rug) with PyTorch’s
allow-mutation context manager, torch.autograd.graph.allow_mutation_on_saved_tensors().
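A minimal sketch of how that context manager is used, assuming a PyTorch version recent enough to provide torch.autograd.graph.allow_mutation_on_saved_tensors() (the toy tensors and the function name are mine, not from your code). I use a whole-tensor inplace op here, as in the documented pattern; I have not checked how it behaves with indexed writes in your setting:

```python
import torch

def backward_with_allowed_mutation():
    # Hypothetical toy graph, not the poster's model.
    w = torch.ones(3, requires_grad=True)
    # Forward pass, mutation, and backward all happen inside the context.
    with torch.autograd.graph.allow_mutation_on_saved_tensors():
        x = w * 2.0
        y = (x ** 2).sum()   # pow saves x for backward
        x.mul_(0.5)          # inplace op; the saved copy of x is cloned first
        y.backward()         # uses the pre-mutation values of x
    return w.grad

if __name__ == "__main__":
    print(backward_with_allowed_mutation())
```

Note that this trades the error for extra memory (the saved tensor gets cloned), so it is still worth understanding where the inplace modification comes from.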

Best.

K. Frank
