Hi, I recently updated my PyTorch from 1.4.0 to 1.8.0. However, when I run my code without any changes, an error occurs:
Traceback (most recent call last):
  File "Loss.py", line 768, in <module>
    lossB.backward()
  File "/Users/opt/anaconda3/envs/torch_18/lib/python3.7/site-packages/torch/tensor.py", line 233, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/Users/opt/anaconda3/envs/torch_18/lib/python3.7/site-packages/torch/autograd/__init__.py", line 146, in backward
    allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [16]] is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
So I enabled anomaly detection, as the hint suggests.
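For reference, this is the standard one-line change I made near the top of the script, before the training loop:

import torch

# Record forward-pass tracebacks so that autograd can point at the
# operation whose backward pass later fails.
torch.autograd.set_detect_anomaly(True)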
With it enabled, I get this:
/Users/opt/anaconda3/envs/torch_18/lib/python3.7/site-packages/torch/autograd/__init__.py:146: UserWarning: Error detected in MseLossBackward. Traceback of forward call that caused the error:
  File "Loss.py", line 755, in <module>
    net_param2=list(modelB.parameters()))
  File "Loss.py", line 335, in loss_cocorrecting_plus
    loss_net = self._net_loss(net_param1, net_param2)
  File "Loss.py", line 105, in _net_loss
    loss += torch.nn.functional.mse_loss(param1, param2)
  File "/Users/opt/anaconda3/envs/torch_18/lib/python3.7/site-packages/torch/nn/functional.py", line 2631, in mse_loss
    return torch._C._nn.mse_loss(expanded_input, expanded_target, _Reduction.get_enum(reduction))
(Triggered internally at /Users/distiller/project/conda/conda-bld/pytorch_1607242180650/work/torch/csrc/autograd/python_anomaly_mode.cpp:104.)
  allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
Traceback (most recent call last):
  File "Loss.py", line 768, in <module>
    lossB.backward()
  File "/Users/opt/anaconda3/envs/torch_18/lib/python3.7/site-packages/torch/tensor.py", line 233, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/Users/opt/anaconda3/envs/torch_18/lib/python3.7/site-packages/torch/autograd/__init__.py", line 146, in backward
    allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [16]] is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
Here is a simplified version of my code:
import torch
import torch.nn.functional as F

def cal_net_dis(net1_param, net2_param):
    # Sum of MSE distances between corresponding parameters of the two nets
    loss = 0
    for param1, param2 in zip(net1_param, net2_param):
        loss += torch.nn.functional.mse_loss(param1, param2)
    return loss

modelA = CNN()  # CNN and dataloader are defined elsewhere in my code
modelB = CNN()
optimizerA = torch.optim.SGD(modelA.parameters(), lr=0.01)
optimizerB = torch.optim.SGD(modelB.parameters(), lr=0.01)

for img, target in dataloader:
    outputA = modelA(img)
    outputB = modelB(img)
    lossA_ = F.cross_entropy(outputA, target)
    lossB_ = F.cross_entropy(outputB, target)
    # net_dis is shared by lossA and lossB, so both backward passes go
    # through the same graph (hence retain_graph=True below)
    net_dis = cal_net_dis(list(modelA.parameters()), list(modelB.parameters()))
    lossA = lossA_ + net_dis
    lossB = lossB_ + net_dis
    optimizerA.zero_grad()
    lossA.backward(retain_graph=True)
    optimizerA.step()
    optimizerB.zero_grad()
    lossB.backward()  # <- the RuntimeError is raised here
    optimizerB.step()
When I move optimizerA.step() so that it runs after both backward() calls, the error disappears:
optimizerA.zero_grad()
lossA.backward(retain_graph=True)
optimizerB.zero_grad()
lossB.backward()
optimizerA.step()
optimizerB.step()
I am very confused. Is this reordering safe? Is there a recommended way to handle this kind of setup, or should I just keep the code as it is now?
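One alternative I thought of, but have not verified, is to clone the parameters inside cal_net_dis, so that the tensors saved for the backward pass are copies rather than the live parameters that optimizer.step() updates in place. A sketch (untested):

def cal_net_dis(net1_param, net2_param):
    loss = 0
    for param1, param2 in zip(net1_param, net2_param):
        # clone() is differentiable, so gradients should still flow back
        # to the original parameters, while the values saved for backward
        # are the clones, which step() never touches
        loss += torch.nn.functional.mse_loss(param1.clone(), param2.clone())
    return loss

Would this make it safe to keep optimizerA.step() between the two backward() calls, or does it change the gradients in some way I am missing?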