What is the impact of using x += y in the forward method of a PyTorch module?

I encountered the error below during backpropagation while training a model. After changing `x += y` to `x = x + y`, the problem went away. As I understand it, `x += y` modifies the original value of `x` in place, while `x = x + y` creates a new tensor. So what effect does creating a new tensor have on the model's backpropagation? A second question: if the value cannot be modified in place, why did my earlier code with `x += y` run correctly? It only stopped working after I added some activation functions.
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [256, 128, 310]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
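
For context, here is a minimal sketch of the kind of forward I mean (the module, layer names, and shapes are just placeholders, not my real model):

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    # Placeholder module: one conv layer, a ReLU, and a residual add.
    def __init__(self, channels=128):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        y = self.relu(self.conv(x))  # y is the output of ReLU
        y += x                       # in-place add: RuntimeError during backward
        # y = y + x                  # out-of-place add: backward works
        return y

x = torch.randn(2, 128, 310, requires_grad=True)
Block()(x).sum().backward()          # fails with y += x, fine with y = y + x
```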

If an in-place operation overwrites a tensor whose original value autograd has saved for the backward pass, then, from what I understand, autograd can no longer compute the gradients from the recorded graph and raises this error. Please see here:

https://pytorch.org/docs/stable/notes/autograd.html

Autograd only disallows in-place operations on tensors whose original values are needed for the gradient computation. In other words, autograd understands and accepts in-place ops as long as the result is still mathematically correct.
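
To make that concrete, here is a small sketch (the shapes and values are arbitrary) contrasting an in-place add that autograd accepts with one it rejects because the overwritten tensor is needed in the backward pass:

```python
import torch

# Case 1: in-place add on an intermediate whose old value autograd never saved.
# The backward of b = a * 2.0 only needs the constant 2.0, not b itself,
# so overwriting b in place is still mathematically fine.
a = torch.randn(3, requires_grad=True)
b = a * 2.0
b += 1.0             # in-place, but b's old value is not needed for any gradient
b.sum().backward()   # works

# Case 2: in-place add on the output of ReLU.
# ReluBackward0 saves its output to compute the gradient, so overwriting that
# output in place invalidates the saved tensor, and backward raises the
# "modified by an inplace operation ... version" error from the question.
c = torch.randn(3, requires_grad=True)
d = torch.relu(c)
d += 1.0             # bumps the version counter of the tensor saved by ReLU
d.sum().backward()   # RuntimeError: output 0 of ReluBackward0 is at version 1
```

That is why your code ran fine until you added activation functions: the in-place add only becomes a problem once it overwrites something (here, the ReLU output) that autograd saved for the backward pass.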
@KFrank explained it in more detail with a great example here.