I was trying to run my code and ran into an error about an in-place operation. I’ve simplified it to the bare minimum so it’s easier to understand. If I’m not wrong, autograd doesn’t like that I modify the tensor y in-place, even if I don’t modify the same elements. Is there a way to allow it? Or is there another way to do what I’m trying to do (use previous slice and parameter u to compute next slice in a differentiable way)?
I hope it’s somewhat clear (it’s my first post).
import torch

def dydt(u, y):
    dy = u * y
    return dy

u = torch.ones(1).requires_grad_()
y = torch.zeros(3, 1)

a = dydt(u, y[0])
y[1] = a + y[0]  # in-place write into a slice of y
b = dydt(u, y[1])
y[2] = b + y[1]  # another in-place write, into a different slice

y[2].sum().backward()  # RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
So to compute the gradient of u, PyTorch needs the intermediate version of y. As the versions (available through the not-official-api-use-at-your-own-risk y._version) are only kept per-tensor and not per-entry, you are seeing that the backward of the product u * y complains about the version of y being different from what it had in the forward pass.
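To make the per-tensor version counter concrete, something roughly like this (same u and y as in your snippet; _version is, again, unofficial and may behave differently across versions):

import torch

u = torch.ones(1, requires_grad=True)
y = torch.zeros(3, 1)

print(y._version)     # 0
a = u * y[0]          # the mul saves y[0] (a view of y) for its backward
y[1] = a + y[0]       # in-place write into a *different* slice of y
print(y._version)     # now bumped -- the counter belongs to the whole tensor
print(y[0]._version)  # views share the counter, even though y[0] was not touched

a.sum().backward()    # fails: the saved y[0] is no longer at the version it was saved at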
I’d probably work with a list and torch.stack the result at the end if you need the y tensor. You could also clone y here and there so that the in-place modifications don’t hit the y tensor used at critical places in the forward.
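A minimal sketch of the list-and-stack variant, assuming the same dydt, u, and starting values as in your example:

import torch

def dydt(u, y):
    return u * y

u = torch.ones(1, requires_grad=True)

ys = [torch.zeros(1)]                    # slice 0; nothing in the list is ever written in place
for _ in range(2):                       # two steps, mirroring your example
    ys.append(dydt(u, ys[-1]) + ys[-1])  # previous slice + u -> next slice

y = torch.stack(ys)                      # shape (3, 1), differentiable w.r.t. u
y[-1].sum().backward()                   # no error: no saved tensor was overwritten
print(u.grad)

(With an all-zero initial state the gradient is of course zero here, but the graph stays intact.)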
It is quite operation-specific which inputs and outputs are saved for the backward, but here it will be the multiplication (*), and indeed, if you follow the debugging hint to enable anomaly detection, this is what you are pointed at.
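For completeness, that hint refers to torch.autograd.detect_anomaly / set_detect_anomaly; a sketch of re-running your failing example with it enabled:

import torch

# With anomaly detection on, the in-place RuntimeError is accompanied by a
# traceback of the forward call that created the failing node.
with torch.autograd.detect_anomaly():
    u = torch.ones(1, requires_grad=True)
    y = torch.zeros(3, 1)
    y[1] = u * y[0] + y[0]
    y[2] = u * y[1] + y[1]
    y[2].sum().backward()  # RuntimeError, now pointing back at the offending u * y[1]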
Thank you for this answer! I’ll go with a list and then stack the elements at the end.