Inplace error in autograd function (PyTorch)

el_youssfi_azeddine · April 13, 2021, 11:05am

Hello,
I am trying to evaluate my model on the GPU, but when I send the data and the model to the GPU I get this error.
I searched for inplace operations, but I didn't find any in this function (autograd).
Does anyone have an idea?
This is the error:
[
     49 def Divergence(var:Tensor,Field:Tensor) -> Tensor:
     50     ones = torch.ones_like(Field)
---> 51     dFdvar = torch.autograd.grad(Field, var,
     52                                  grad_outputs=ones,
     53                                  create_graph=True, retain_graph=True)[0]

~\Anaconda3\lib\site-packages\torch\autograd\__init__.py in grad(outputs, inputs, grad_outputs, retain_graph, create_graph, only_inputs, allow_unused)
    200         retain_graph = create_graph
    201
--> 202     return Variable._execution_engine.run_backward(
    203         outputs, grad_outputs, retain_graph, create_graph,
    204         inputs, allow_unused)

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [51126]], which is output 2 of struct torch::jit::`anonymous namespace'::DifferentiableGraphBackward, is at version 3; expected version 2 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
]
Thanks!

ptrblck · April 14, 2021, 5:22am

If you cannot find the inplace operation, you could add .clone() operations to the activations and check which line of code causes the issue. This is a bit cumbersome, so you could also check again for all inplace ops. They either use the inplace=True argument or are applied via an inplace operator directly, e.g. a += b.
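To illustrate the advice above, here is a minimal sketch that reproduces the same kind of error on the CPU and shows the two workarounds mentioned (.clone() and an out-of-place op). The function name grad_of_exp_plus_one and the toy computation are assumptions made for illustration; they are not taken from the original poster's code.

```python
import torch


def grad_of_exp_plus_one(mode: str) -> torch.Tensor:
    """Return d/dx sum(exp(x) + 1), adding the 1 in three different ways."""
    x = torch.ones(3, requires_grad=True)
    y = x.exp()            # exp() saves its output for the backward pass
    if mode == "inplace":
        y += 1             # overwrites the saved output and bumps its version
    elif mode == "clone":
        y = y.clone()      # the in-place op now modifies a copy instead
        y += 1
    else:
        y = y + 1          # out-of-place: allocates a new tensor
    (grad,) = torch.autograd.grad(y.sum(), x)
    return grad


# The in-place variant fails when the gradient is computed, not at `y += 1`.
try:
    grad_of_exp_plus_one("inplace")
    inplace_failed = False
except RuntimeError as err:
    inplace_failed = "inplace operation" in str(err)

grad_clone = grad_of_exp_plus_one("clone")
grad_out_of_place = grad_of_exp_plus_one("out_of_place")
```

Note that the error is only raised once the gradient is requested, which is why the traceback points at torch.autograd.grad rather than at the offending line. Enabling torch.autograd.set_detect_anomaly(True) while debugging may also help, since it reports the forward operation whose backward failed.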

el_youssfi_azeddine · April 14, 2021, 6:37am

Hello, thanks for the reply.
I replaced all a += b with a = a + b, so there is no such operation left in my code.
I read somewhere that I should use "with torch.no_grad():". I tried it, but I get another error: "element 0 of tensors does not require grad and does not have a grad_fn"

ptrblck · April 14, 2021, 6:51am

No, you shouldn't use the no_grad() guard if you need to compute gradients. This wrapper is used during validation or testing to save memory: the intermediate tensors, which would be needed to compute the gradients, are not stored.
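As a small sketch of why the guard produces the second error quoted above: tensors created inside no_grad() are detached from the graph, so any later gradient request has nothing to differentiate through. The linear layer and random input here are placeholders chosen for illustration.

```python
import torch

model = torch.nn.Linear(2, 1)   # placeholder model for illustration
x = torch.randn(4, 2)

with torch.no_grad():
    out = model(x)              # no graph is recorded inside the guard

# The output is detached from the parameters: no grad_fn, no requires_grad.
out_requires_grad = out.requires_grad
out_grad_fn = out.grad_fn

try:
    out.sum().backward()
    message = ""
except RuntimeError as err:
    message = str(err)          # "element 0 of tensors does not require grad ..."
```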

el_youssfi_azeddine · April 14, 2021, 6:54am

Thanks for your quick answer, this is one of the reasons I like this forum.
Actually, I'm in the validation phase: I loaded the trained model and stored it on the GPU, and I send new data to the GPU for the model to predict on.

ptrblck · April 14, 2021, 6:56am

During the validation phase you shouldn't get the initial error message, as it would be caused by a backward() call, which should probably not be used during validation.

el_youssfi_azeddine · April 14, 2021, 6:59am

You're right, I got the initial error before using torch.no_grad(). I tried no_grad() in order to avoid the gradient error.

el_youssfi_azeddine · April 14, 2021, 7:17am

There is something I don't understand: why does the model need to compute gradients during the validation phase? If this error occurs because of the gradient, why not just disable grad computation?

ptrblck · April 14, 2021, 10:45pm

This is something I also don't understand in your use case.
The error is raised by Autograd, as it's detecting an inplace operation.
However, you explained that you are seeing it during the validation step, which wouldn't calculate gradients in the common use case, so you would have to check why you are trying to calculate gradients during the validation step.

el_youssfi_azeddine · April 15, 2021, 3:59am

Thanks for the reply.
In fact, I'm working with PINNs (physics-informed neural networks): I'm trying to solve a PDE with a neural network. In my loss function, we use the gradient and divergence of the temperature to express the physics loss.
So the gradients I'm talking about above are calculated in the forward method of the model, which returns all the physics terms that the model computes and that are used to define the physics loss.
Thanks
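For a setup like this, where derivatives with respect to the inputs are part of the forward pass, one possible approach is to leave autograd enabled during validation, mark the inputs as requiring grad, and switch off gradients only for the parameters. The sketch below assumes a toy network and random coordinates as stand-ins; the names net, coords, and temperature are hypothetical and not the original poster's model.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for a trained PINN mapping (x, y) -> temperature.
net = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 1))
net.eval()
for p in net.parameters():
    p.requires_grad_(False)      # validation: no parameter gradients needed

# The inputs must require grad so that dT/dx and dT/dy can be computed.
coords = torch.rand(8, 2).requires_grad_(True)
temperature = net(coords)

# First derivatives of the temperature with respect to the coordinates.
(grad_t,) = torch.autograd.grad(
    temperature, coords,
    grad_outputs=torch.ones_like(temperature),
    create_graph=True,           # keep the graph for second derivatives
)

# Second derivatives of dT/dx, e.g. for a diffusion term in the PDE residual.
dT_dx = grad_t[:, 0]
(grad2,) = torch.autograd.grad(
    dT_dx, coords, grad_outputs=torch.ones_like(dT_dx)
)
```

Wrapping this in torch.no_grad() would fail, because the input derivatives themselves rely on the recorded graph. This sketch only covers why autograd stays on during validation; it does not locate the inplace operation behind the original error.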