I encountered this problem when trying to assign a value to a tensor as below (RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation). Note that my in-place operation happens before computing loss = (b*b).mean().
import torch
print(torch.__version__) # 1.8.1
torch.autograd.set_detect_anomaly(True)
a = torch.rand(3, requires_grad=True)
b = torch.sigmoid(a)
b[0] = 1
loss = (b*b).mean()
b.retain_grad()
loss.backward() # RuntimeError: one of the variables needed for gradient computation has been modified by an in-place operation...
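For what it's worth, I also noticed that inserting a .clone() before the in-place write avoids the error. A minimal sketch (the .clone() workaround is my own addition, not part of the snippets here):

```python
import torch

a = torch.rand(3, requires_grad=True)
b = torch.sigmoid(a)
c = b.clone()        # the clone is what gets modified, not sigmoid's output
c[0] = 1             # in-place write on the clone
loss = (c * c).mean()
loss.backward()      # no RuntimeError
print(a.grad)        # a.grad[0] is 0, the other entries are nonzero
```

This seems to suggest the backward pass needs sigmoid's output unmodified, and working on a copy sidesteps that.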
After that, I found that if I compute the sigmoid step by step instead of calling torch.sigmoid(), as below, .backward() works successfully.
import torch
print(torch.__version__) # 1.8.1
torch.autograd.set_detect_anomaly(True)
a = torch.rand(3, requires_grad=True)
# b = torch.sigmoid(a)
b = 1 / (1+torch.exp(-a))
b[0] = 1
loss = (b*b).mean()
b.retain_grad()
loss.backward() # success
print(b.grad) # tensor([0.6667, 0.4498, 0.4630])
print(a.grad) # tensor([0.0000, 0.0987, 0.0982])
In addition, I found that using .index_fill() instead of direct assignment, as below, also works.
import torch
print(torch.__version__) # 1.8.1
torch.autograd.set_detect_anomaly(True)
a = torch.rand(3, requires_grad=True)
b = torch.sigmoid(a)
# b[0] = 1
b = b.index_fill(0, torch.LongTensor([0]), 1)
loss = (b*b).mean()
b.retain_grad()
loss.backward() # success
print(b.grad) # tensor([0.6667, 0.4498, 0.4630])
print(a.grad) # tensor([0.0000, 0.0987, 0.0982])
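My current guess is that the error is tied to PyTorch's version counter: every in-place write bumps a tensor's internal _version, and backward raises if a saved tensor's version has changed. A sketch using the internal _version attribute (an undocumented internal, so this is an assumption on my part):

```python
import torch

a = torch.rand(3, requires_grad=True)
b = torch.sigmoid(a)
print(b._version)   # 0: no in-place ops on b yet

c = b.index_fill(0, torch.LongTensor([0]), 1)
print(b._version)   # still 0: index_fill is out-of-place and returns a new tensor
print(c is b)       # False

b[0] = 1
print(b._version)   # 1: direct assignment is in-place and bumps the counter
```

That would explain why index_fill() is safe: it never touches the tensor that sigmoid's backward saved.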
In conclusion, I have the following questions:
- Why does the in-place assignment raise an error after torch.sigmoid(), while the step-by-step version has no problem?
- Why does the code work after switching to .index_fill()?
Thanks for any reply!