In TensorFlow (1.x), I can collect several losses and minimize their sum like this:

    loss = criterion(output, target)
    tf.losses.add_loss(loss)                 # register the loss in the loss collection
    total_loss = tf.losses.get_total_loss()  # sum of all registered losses
    train_op = tf.train.MomentumOptimizer(learning_rate=init_lr, momentum=0.9).minimize(
        total_loss, global_step=global_step)
But in PyTorch I can't find any method like add_loss or get_total_loss, so I tried to accumulate the losses myself with a custom loss module:
    class Loss_func(nn.Module):
        def __init__(self):
            super(Loss_func, self).__init__()
            self.totalLoss = 0  # running sum of the losses seen so far

        def forward(self, output, target):
            temp_loss = criterion(output, target)         # criterion is the base loss, e.g. nn.CrossEntropyLoss()
            self.totalLoss = self.totalLoss + temp_loss   # accumulate across batches
            return self.totalLoss
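In the training loop I call it roughly like this (a simplified sketch of my code; trainLoader stands for my DataLoader):

    lossFunc = Loss_func()
    for batchImg, target in trainLoader:
        features, outputs = model(batchImg.cuda())
        loss = lossFunc(outputs, target.cuda())
        opt.zero_grad()
        loss.backward()
        opt.step()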
With this, the first iteration runs, but the second call to loss.backward() fails with:
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
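As far as I understand, this happens because self.totalLoss keeps a reference to the graph of every previous batch, and that graph is freed by the first backward(). A minimal standalone sketch that reproduces the same failure mode (my own toy example, not from the actual project):

    import torch

    w = torch.randn(2, requires_grad=True)
    total = torch.zeros(())

    for step in range(2):
        loss = (w ** 2).sum()   # a fresh graph for this step
        total = total + loss    # total still references the graph of the previous step
        total.backward()        # step 1 tries to backward through step 0's freed graph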
Following the hint in the message, I changed the backward call to

    loss.backward(retain_graph=True)

but then a new error occurs:
    Warning: Error detected in AddmmBackward. Traceback of forward call that caused the error:
      File "/vol/research/Jigsaw/jigsaw_torch.py", line 231, in train
        features, outputs = model(batchImg.cuda())
      File "/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/vol/research/Jigsaw/jigsaw_torch.py", line 86, in forward
        x = self.fc(x)
      File "/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 93, in forward
        return F.linear(input, self.weight, self.bias)
      File "/lib/python3.8/site-packages/torch/nn/functional.py", line 1690, in linear
        ret = torch.addmm(bias, input, weight.t())
     (function _print_stack)
    Traceback (most recent call last):
      File "/vol/research/Jigsaw/jigsaw_torch.py", line 439, in <module>
        main()
      File "/vol/research/Jigsaw/jigsaw_torch.py", line 435, in main
        train(trainSet, testSet, tripletSet, model, opt, lossFunc, epoch)
      File "/vol/research/sketch/Jigsaw/jigsaw_torch.py", line 234, in trainAndValidate
        loss.backward(retain_graph=True)
      File "/lib/python3.8/site-packages/torch/tensor.py", line 221, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph)
      File "/lib/python3.8/site-packages/torch/autograd/__init__.py", line 130, in backward
        Variable._execution_engine.run_backward(
    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1024, 81]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
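If I read the shapes right, [1024, 81] is self.fc.weight.t() (num_classes = 81 in my run), and the in-place modification is presumably opt.step() updating the weights between the two backward passes. This toy example (my guess at what is going on, not the real code) reproduces the same version-mismatch error:

    import torch
    import torch.nn as nn

    fc = nn.Linear(1024, 81)   # same shape as self.fc in my model
    opt = torch.optim.SGD(fc.parameters(), lr=0.1, momentum=0.9)
    total = 0

    for step in range(2):
        x = torch.randn(4, 1024)
        total = total + fc(x).sum()         # step 1 still holds step 0's graph
        total.backward(retain_graph=True)   # step 1 revisits step 0's graph ...
        opt.step()                          # ... but fc.weight was updated in place after step 0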
The model is GoogLeNet. Does this error mean that the self.fc = nn.Linear(1024, num_classes) layer in GoogLeNet can't be backpropagated through a second time once the optimizer has updated its weights? More generally, how can I compute backward from a total loss in PyTorch, the way tf.losses.get_total_loss() works in TensorFlow?
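What I'm ultimately after is something like this per-step pattern (just a sketch of what I imagine; aux_loss is a hypothetical stand-in for any extra loss term computed from the same forward pass):

    import torch
    import torch.nn as nn

    criterion = nn.CrossEntropyLoss()

    def aux_loss(features):                 # hypothetical extra loss term
        return 1e-4 * features.pow(2).mean()

    def train_step(model, opt, batchImg, target):
        opt.zero_grad()
        features, outputs = model(batchImg)
        total_loss = criterion(outputs, target) + aux_loss(features)  # only this batch's losses
        total_loss.backward()               # one backward over one graph
        opt.step()
        return total_loss.item()            # keep a float, not the graph, for logging

Is summing the per-batch losses into one tensor and calling backward() once the intended replacement for get_total_loss(), with any running total accumulated as a plain float via .item() instead of as a tensor?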