This question is about tensors, computation graphs, and in-place operations.
Here are some operations that all try to modify the same underlying data in storage:
c = torch.randn(10, 3, 2, 32, requires_grad=True)
print(c.is_leaf)
# operation 1
try:
    c += 1  # RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
except RuntimeError as e:
    print(e)
# operation 2
try:
    torch.add(c, 1, out=c)  # RuntimeError: add(): functions with out=... arguments don't support automatic differentiation, but one of the arguments requires grad.
except RuntimeError as e:
    print(e)
# operation 3
try:
    c.add_(1)  # RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
except RuntimeError as e:
    print(e)
# operation 4
try:
    c.index_copy_(3, torch.tensor([10]), torch.randn(10, 3, 2, 1))  # RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
except RuntimeError as e:
    print(e)
# operation 5
d = c.detach()
d += 1 # no error, and still leaf
print(c.is_leaf)
# operation 6
c[:] += 1 # no error, but not leaf anymore
print(c.is_leaf)
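For comparison, wrapping the in-place operation in `torch.no_grad()` seems to sidestep the error entirely (a minimal check, assuming current PyTorch behavior; this is apparently how optimizers update parameters in place):

```python
import torch

c = torch.randn(10, 3, 2, 32, requires_grad=True)
with torch.no_grad():
    c += 1  # no error: autograd records nothing inside no_grad()
print(c.is_leaf)        # True: c is still a leaf
print(c.requires_grad)  # True
```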
There are even more such operations if you consider numpy() and from_numpy().
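To spell out the numpy() case, here is a minimal sketch (assuming a CPU tensor, since numpy() shares storage with the tensor only on CPU):

```python
import torch

c = torch.randn(3, requires_grad=True)
try:
    c.numpy()  # RuntimeError: can't call numpy() on a tensor that requires grad
except RuntimeError as e:
    print(e)

orig = c.detach().clone()
a = c.detach().numpy()  # a shares c's underlying storage
a += 1                  # modifies c's data in place: no error, no autograd record
print(c.is_leaf)        # still True
```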
What is going on here 😨? Why can you sometimes modify the underlying data of tensor c, but not at other times? And why does an in-place operation on an indexed tensor make is_leaf False?
And what's the lesson here? Are such behaviors intended, and do they have special use cases? Or should we avoid in-place operations on tensors that require grad altogether?
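For completeness, the .data attribute (deprecated, as far as I know) also appears to behave like detach() here, giving yet another way to touch the storage without an error:

```python
import torch

c = torch.randn(10, 3, 2, 32, requires_grad=True)
c.data += 1       # no error: .data is a detached view of the same storage
print(c.is_leaf)  # still True
```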