RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation [torch.cuda.FloatTensor [64, 192, 16, 16]], which is output 0 of AsStridedBackward0, is at version 3; expected version 0 instead

KFrank · July 7, 2024, 6:15pm

Hi Oussama!

Depending on your use case, you might be able to “automatically” fix your
issue by using pytorch’s sweep-inplace-modification-errors-under-the-rug
context manager, `torch.autograd.graph.allow_mutation_on_saved_tensors()`.
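
As a rough illustration (not Oussama's actual model), here is a minimal sketch of what that context manager, `torch.autograd.graph.allow_mutation_on_saved_tensors()`, does. The toy tensors and the `sin_()` call are assumptions made up for the example, and it needs a reasonably recent pytorch version:

```python
import torch

# Toy reproduction (hypothetical, not the original poster's code):
# pow saves b for the backward pass, then b is modified in place.
a = torch.ones(3, requires_grad=True)
b = a.clone()
out = (b ** 2).sum()
b.sin_()  # inplace modification of a tensor that backward needs
try:
    out.backward()
    failed = False
except RuntimeError:
    failed = True  # the inplace-modification error is raised here

# Same computation under the context manager: saved tensors are
# cloned when they get mutated, so backward sees the original values.
with torch.autograd.graph.allow_mutation_on_saved_tensors():
    a2 = torch.ones(3, requires_grad=True)
    b2 = a2.clone()
    out2 = (b2 ** 2).sum()
    b2.sin_()
    out2.backward()

grad = a2.grad  # gradient of sum(b ** 2), using the pre-sin_ values of b
```

Note that the clone costs extra memory, so this trades away some of the memory savings that the inplace operation gave you in the first place.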

If you want to track down what is causing your issue, you can find some
techniques for debugging inplace-modification errors in this post:

"RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [64, 1]], which is output 0 of AsStridedBackward0, is at version 3; expected version 2 instead. Hint: the backtrace further a autograd

Hi Fahmyadan and Sangyoon! Here are some suggestions about how to track down (and maybe fix) inplace-modification errors. Note that an inplace modification in the forward pass is not necessarily an error – it depends on whether and how the tensor that was modified is used in the backward pass. Note that inplace operations can be useful for saving memory – if you replace an innocent inplace operation with an out-of-place equivalent, your training will use more memory (and, to a minor e…
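
To make the "is at version 3; expected version 0" part of the error message concrete, here is a small sketch of my own (the tensors and ops are hypothetical, chosen only for illustration). It reads a tensor's `_version` counter before and after an inplace operation, and then shows the out-of-place replacement mentioned in the excerpt above:

```python
import torch

a = torch.ones(3, requires_grad=True)
b = a.clone()
out = (b ** 2).sum()          # pow saves b (at its current version) for backward

version_before = b._version   # version counter when b was saved
b.sin_()                      # each inplace op bumps the counter
version_after = b._version

try:
    out.backward()            # saved version != current version -> RuntimeError
    failed = False
except RuntimeError:
    failed = True

# Out-of-place equivalent: sin() returns a new tensor, so the tensor
# saved by pow is left untouched (at the cost of extra memory).
a_fixed = torch.ones(3, requires_grad=True)
b_fixed = a_fixed.clone()
out_fixed = (b_fixed ** 2).sum()
b_fixed = b_fixed.sin()
out_fixed.backward()
grad_fixed = a_fixed.grad
```

Checking `_version` on suspect tensors at a few points in the forward pass is one way to narrow down which line performs the offending modification.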

Just speculation, but your FloatTensor [64, 192, 16, 16] could
be the .weight of a Conv2d (192, 64, 16) or maybe a batch of 64
192-channel 16 x 16 image patches.
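
A quick way to check the two guesses about the [64, 192, 16, 16] shape. This is only a sketch: the layer and the zero-filled batch are stand-ins I made up, not anything taken from your model:

```python
import torch

# Guess 1: a Conv2d with in_channels=192, out_channels=64, kernel_size=16.
# Its weight has shape [out_channels, in_channels, kernel_h, kernel_w].
conv = torch.nn.Conv2d(192, 64, 16)
weight_shape = tuple(conv.weight.shape)

# Guess 2: a batch of 64 patches, each with 192 channels and 16 x 16 pixels.
patches = torch.zeros(64, 192, 16, 16)
patches_shape = tuple(patches.shape)

same_shape = weight_shape == patches_shape
```

Comparing the shape in the error message against the shapes of your parameters and intermediate activations can help you identify which tensor is being modified.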

Good luck!

K. Frank