Let’s say I want to train an autoencoder with two loss functions (L1 and L2/MSE).
Does the order of the loss calls in the code matter for the gradient calculation?
For example, the code would look like this:
import torch
import torch.nn as nn

loss_1 = nn.L1Loss()
loss_2 = nn.MSELoss()
input = torch.randn(3, 5, requires_grad=True)
target = torch.randn(3, 5)
output_1 = loss_1(input, target)  # L1 term
output_2 = loss_2(input, target)  # MSE term
loss = output_1 + output_2        # sum the loss values, not the loss modules
loss.backward()
If I swap the order of the two loss calculations like this,
loss_1 = nn.L1Loss()
loss_2 = nn.MSELoss()
input = torch.randn(3, 5, requires_grad=True)
target = torch.randn(3, 5)
output_2 = loss_2(input, target)  # MSE term first this time
output_1 = loss_1(input, target)  # then the L1 term
loss = output_1 + output_2
loss.backward()
Will this kind of change affect the model output? I think it will, because the L2 norm is a more restrictive norm, so the model would need more epochs to converge.
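To make the question concrete, here is a minimal sanity check I would run to compare the gradients on input from the two orderings (a standalone sketch on random tensors, not my actual autoencoder training code):

import torch
import torch.nn as nn

torch.manual_seed(0)
loss_1 = nn.L1Loss()
loss_2 = nn.MSELoss()
input = torch.randn(3, 5, requires_grad=True)
target = torch.randn(3, 5)

# Ordering A: L1 term first, then MSE
loss_a = loss_1(input, target) + loss_2(input, target)
loss_a.backward()
grad_a = input.grad.clone()

input.grad = None  # reset the accumulated gradient before the second run

# Ordering B: MSE term first, then L1
loss_b = loss_2(input, target) + loss_1(input, target)
loss_b.backward()
grad_b = input.grad.clone()

# Since addition is commutative, both orderings should produce the same gradient
print(torch.allclose(grad_a, grad_b))  # expected: True

If the two gradients match here, I would expect the call order not to matter for training either.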