Backpropagating loss functions separately

I have fine-tuned a pre-trained model on multiple tasks, optimizing it with a different loss function for each task.

How do these two approaches differ in optimizing the model?
Approach 1: L = L1 + alpha * L2 + beta * L3; backpropagate the weighted total loss.
Approach 2: backpropagate the losses separately, i.e. L1, then alpha * L2, then beta * L3.

Kindly explain these two ways of backpropagating the losses.

Both will yield the same results, but the second approach calls backward multiple times and would thus add some runtime overhead.
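For example, here is a minimal sketch (using a toy nn.Linear model and made-up losses, not your actual setup) that checks that the gradients from both approaches match, since separate backward calls accumulate into the same .grad buffers:

import torch
import torch.nn as nn

torch.manual_seed(0)
# Toy setup, purely for illustration
model = nn.Linear(10, 3)
x = torch.randn(4, 10)
target = torch.randn(4, 3)
alpha, beta = 0.5, 0.1

def compute_losses():
    out = model(x)
    loss_1 = nn.functional.mse_loss(out, target)
    loss_2 = out.abs().mean()
    loss_3 = out.pow(2).mean()
    return loss_1, loss_2, loss_3

# Approach 1: single backward on the weighted sum
model.zero_grad()
loss_1, loss_2, loss_3 = compute_losses()
(loss_1 + alpha * loss_2 + beta * loss_3).backward()
grads_sum = [p.grad.clone() for p in model.parameters()]

# Approach 2: separate backward calls; gradients accumulate in .grad
model.zero_grad()
loss_1, loss_2, loss_3 = compute_losses()
loss_1.backward(retain_graph=True)
(alpha * loss_2).backward(retain_graph=True)
(beta * loss_3).backward()
grads_sep = [p.grad.clone() for p in model.parameters()]

# Up to floating point rounding, both should match
print(all(torch.allclose(a, b) for a, b in zip(grads_sum, grads_sep)))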


Thank you for the reply.
I have fine-tuned the model with both approaches, accepting the runtime overhead of the second one, but I get different results. What could be the possible reason?
I have backpropagated the losses in approach 2 like this:
loss_1.backward(retain_graph=True)  # keep the graph for the next backward call
loss_2.backward(retain_graph=True)
loss_3.backward()  # last call can free the graph

I don’t know, but I would recommend comparing the actual loss values first, then the gradients, etc. If you get stuck, minimize the code and post an executable code snippet reproducing the issue.
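To narrow it down, you could snapshot the gradients right after the backward pass of each approach (starting from identical weights and the same batch) and compare them, e.g. with helpers like the ones below (grad_snapshot and compare_snapshots are made-up names; adapt them to your code):

import torch

def grad_snapshot(model):
    # Detached copy of each parameter's gradient (or None if unset)
    return {name: (p.grad.detach().clone() if p.grad is not None else None)
            for name, p in model.named_parameters()}

def compare_snapshots(a, b, atol=1e-6):
    # Print parameters whose gradients differ between the two runs
    for name in a:
        if (a[name] is None) != (b[name] is None):
            print(name, "gradient present in only one run")
        elif a[name] is not None and not torch.allclose(a[name], b[name], atol=atol):
            print(name, "max abs diff:", (a[name] - b[name]).abs().max().item())

Call grad_snapshot(model) after the backward pass in approach 1 and again in approach 2, then pass both snapshots to compare_snapshots to see which parameters diverge.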