Hello, I am trying to incorporate multiple losses into a multi-model structure. My current workflow is as follows:
```python
optimizer = optim.SGD([{'params': model_1.parameters(), 'lr': 0.1},
                       {'params': model_2.parameters(), 'lr': 0.01},
                       {'params': criterion_1.parameters(), 'lr': 0.01}])

optimizer.zero_grad()  # clear stale gradients before each step
output_1, loss_1 = model_1(input)    # model_1 also returns its own loss
output_2, loss_2 = model_2(output_1)
loss_3 = criterion_1(output_1, labels)
loss_4 = criterion_2(output_2, labels)
loss = loss_1 + loss_2 + loss_3 + loss_4
loss.backward()
optimizer.step()
```
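For concreteness, here is a self-contained toy version of the same workflow (`ToyModel`, `ToyCriterion`, and all shapes are made up for illustration; my real models just return an `(output, loss)` tuple like this):

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

class ToyModel(nn.Module):
    """Stand-in model that returns (output, internal_loss)."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, x):
        out = self.linear(x)
        # made-up internal loss: an activation regularizer
        return out, out.pow(2).mean()

class ToyCriterion(nn.Module):
    """Stand-in criterion with its own learnable parameters."""
    def __init__(self, dim):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, pred, target):
        return ((pred - target) * self.weight).pow(2).mean()

dim = 4
model_1, model_2 = ToyModel(dim), ToyModel(dim)
criterion_1 = ToyCriterion(dim)
criterion_2 = nn.MSELoss()  # no parameters, so not in the optimizer

optimizer = optim.SGD([{'params': model_1.parameters(), 'lr': 0.1},
                       {'params': model_2.parameters(), 'lr': 0.01},
                       {'params': criterion_1.parameters(), 'lr': 0.01}])

inputs, labels = torch.randn(8, dim), torch.randn(8, dim)

optimizer.zero_grad()
output_1, loss_1 = model_1(inputs)
output_2, loss_2 = model_2(output_1)
loss_3 = criterion_1(output_1, labels)
loss_4 = criterion_2(output_2, labels)
loss = loss_1 + loss_2 + loss_3 + loss_4
loss.backward()
optimizer.step()
```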
I am worried that this will cause a double gradient, because `backward()` will go through both `loss_1` and `loss_3`, which both, in a sense, come from `model_1`. The same applies to `loss_2` and `loss_4`, although `criterion_2` has no parameters to be optimized.
I am also wondering what problems a double gradient would cause if it did happen.
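To make the worry concrete, here is a toy scalar example (numbers made up) of the situation I mean, where one parameter feeds two losses. A single `backward()` call accumulates both contributions into `.grad`:

```python
import torch

# One parameter w feeds two losses through the same intermediate `out`,
# mirroring how output_1 feeds both loss_1 and loss_3 in my setup.
w = torch.tensor(2.0, requires_grad=True)
out = w * 3.0          # plays the role of output_1; out = 6
loss_a = out ** 2      # d(loss_a)/dw = 2 * out * 3 = 36
loss_b = 5.0 * out     # d(loss_b)/dw = 5 * 3 = 15

(loss_a + loss_b).backward()
print(w.grad)          # tensor(51.), i.e. 36 + 15 summed into w.grad
```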
Thank you very much.