Okay, so this works:
q = torch.tensor([2], requires_grad=True, dtype=torch.double)
XR = torch.zeros([2], dtype=torch.double)
XR[0] = 2
XR[1] = q[0]**2
z = torch.sum(XR)
z.backward()
q.grad
tensor([ 4.], dtype=torch.float64)
But this does not:
q = torch.tensor([2], requires_grad=True, dtype=torch.double)
XR = torch.zeros([2], dtype=torch.double)
XR[0] = 2
XR[1] = q[0]**XR[0]
z = torch.sum(XR)
z.backward()
q.grad
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
Obviously, this is a silly toy example, but I am ultimately trying to do numerical integration, and it would be nice to find a way to do this. Is it possible, or do I have to use a workaround?
Thanks,
DS
The problem is that you modify the tensor XR in place.
To avoid this, you could use XR[1] = (q[0]**XR[0]).clone(), which allocates new storage and creates a new tensor.
Hmmm, maybe I’m doing something wrong?
q = torch.tensor([2], requires_grad=True, dtype=torch.double)
XR = torch.zeros([2], dtype=torch.double)
XR[0] = 2
XR[1] = (q[0]**XR[0]).clone()
z = torch.sum(XR)
z.backward()
q.grad
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
Also, I should have put this in the initial post: I will eventually be considering the case where XR[0] is not just a scalar but a function of q as well, and I will want to get the gradient of XR[0] with respect to q.
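(Not from the thread, but for the eventual case where the first entry also depends on q: one common pattern that sidesteps in-place writes entirely is to build the vector out of place with torch.stack. The expressions for x0 and x1 below are made up purely for illustration.)

```python
import torch

q = torch.tensor([2.0, 3.0], requires_grad=True, dtype=torch.double)

# Build each entry as its own tensor, then stack them out of place.
# No element of any tensor is ever overwritten, so autograd's saved
# tensors are never invalidated.
x0 = q[1] * 2      # hypothetical: first entry depends on q
x1 = q[0] ** 2     # second entry as in the original post
XR = torch.stack([x0, x1])

z = XR.sum()
z.backward()
print(q.grad)  # tensor([4., 2.], dtype=torch.float64)
```

Since z = 2*q[1] + q[0]**2, the gradient is [2*q[0], 2] = [4., 2.], and both entries of XR contribute.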
Just wanted to update this thread, I know it’s garnering a lot of interest.
justusschock was right: you can use .clone() and avoid the error.
There was just a small issue with the code; it should be this:
XR[1] = q[0]**XR[0].clone()
I am still unsure why the .clone() is required. I know it has something to do with the computation graph getting messed up, but I would love to know exactly what is going on.
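(As far as I understand, here is what is going on: pow needs its exponent to compute the backward pass, so autograd saves it. Without the clone, that saved tensor is XR[0], a view sharing storage, and a version counter, with XR. The write XR[1] = ... bumps that counter, so backward detects a stale saved tensor. A sketch that makes this visible; ._version is an internal PyTorch attribute, used here purely for illustration:)

```python
import torch

q = torch.tensor([2], requires_grad=True, dtype=torch.double)
XR = torch.zeros([2], dtype=torch.double)
XR[0] = 2

# pow saves its exponent for backward; without .clone(), that saved
# tensor is the view XR[0], which shares a version counter with XR.
y = q[0] ** XR[0]
v_before = XR._version

XR[1] = y          # in-place write: bumps XR's version counter
v_after = XR._version
print(v_before, v_after)   # the counter increased

err = None
try:
    torch.sum(XR).backward()
except RuntimeError as e:
    err = e
print("backward failed:", err)  # "... modified by an inplace operation"
```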
THANKS!
DS