Hello ,
My loss function has three parts: loss = p1*loss1 + p2*loss2 + p3*loss3
I first want to get the gradient of the network parameters with respect to loss1, loss2, and loss3 individually (for some purpose), and then update the network parameters by optimizing the total loss, i.e.,
loss1.backward()
net.l1.weight.grad  # l1 is a layer in my network
loss2.backward()
net.l1.weight.grad
loss3.backward()
net.l1.weight.grad
No optimization is involved up to this point; I just want to inspect the gradient of each loss. Then I want to update the network parameters with
opt.zero_grad()
loss.backward()
opt.step()
However, this gives me an error message asking me to specify retain_graph=True the first time I call backward() on each individual loss.
The error goes away if I use retain_graph=True, but is this the correct way to solve the problem? Is there a simpler way to get the gradient of each loss and then update the parameters by optimizing the total loss? Thank you very much!
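Edit: here is a minimal, self-contained sketch of what I am doing. The network, data, and loss definitions are toy placeholders (my real model has a layer l1; here the single Linear layer plays that role), but the structure is the same: three losses sharing one forward pass, per-loss gradient inspection, then an update on the weighted total. Because the losses share one graph, each backward() before the last one needs retain_graph=True, and I zero the grads between calls so each loss's gradient is isolated rather than accumulated:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins for my real network and data.
net = nn.Linear(4, 1)  # in my real model this would be net.l1
x = torch.randn(8, 4)
y = torch.randn(8, 1)
out = net(x)

# Three losses that all depend on the same forward pass.
loss1 = ((out - y) ** 2).mean()
loss2 = out.abs().mean()
loss3 = out.pow(2).mean()
p1, p2, p3 = 0.5, 0.3, 0.2

opt = torch.optim.SGD(net.parameters(), lr=0.1)

# Inspect each loss's gradient individually. retain_graph=True keeps the
# shared graph alive for the next backward(); zeroing between calls keeps
# the gradients from accumulating across losses.
grads = []
for loss in (loss1, loss2, loss3):
    net.zero_grad()
    loss.backward(retain_graph=True)
    grads.append(net.weight.grad.clone())

# Now update the parameters by optimizing the weighted total loss.
total = p1 * loss1 + p2 * loss2 + p3 * loss3
opt.zero_grad()
total.backward()
opt.step()
```

This runs without the error, but it retains the whole graph for every intermediate backward(), which is what I am unsure about.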