Gradient flow of different losses

I have 4 losses used in the backward pass, one of which is a custom loss. The losses have different scales: for example, Loss1 and Loss2 are on the order of e-01, while Loss3 is on the order of e+01. Through a hook I checked that the gradients at the output are in the range e-04 to e-05 for Loss1 to Loss3. The custom Loss4, however, has a high value, and when it backpropagates to the output it gives gradients in the range e+03 to e+04.
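For reference, the kind of per-loss gradient check described above can be done with `Tensor.register_hook`. This is a minimal sketch, not the actual model: a small tensor stands in for the network output, and each loss is back-propagated separately so the hook sees that loss's gradient alone.

```python
import torch

# Hypothetical stand-in for the network output.
out = torch.randn(8, requires_grad=True)

grad_scale = {}

def make_hook(name):
    def hook(grad):
        # Record the mean absolute gradient flowing into `out`.
        grad_scale[name] = grad.abs().mean().item()
    return hook

# Back-propagate each loss separately so the hook sees its gradient alone.
for name, factor in [("loss_small", 1e-1), ("loss_big", 1e4)]:
    h = out.register_hook(make_hook(name))
    loss = factor * (out ** 2).mean()
    loss.backward()
    out.grad = None   # clear between runs so scales don't mix
    h.remove()

print(grad_scale)  # loss_big's gradient is ~1e5 times loss_small's here
```

If all four losses are summed and backward is called once, the hook only sees the combined gradient, which is dominated by the largest term.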

First, I tried to scale Loss4, but there was no improvement in the result. Second, I scaled the gradient of Loss4 at the output through a hook, but that didn't have any effect either. I tried all the scalings I could think of. I am using the Adam optimizer.
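One thing worth noting about the hook-based scaling: a hook registered on the output tensor fires for every backward pass through it, so unless it is registered and removed around the Loss4 backward alone, it rescales the gradients of Loss1 to Loss3 as well. A minimal sketch of scoping the hook to one loss (the scale factor and loss are hypothetical):

```python
import torch

# Hypothetical stand-in for the network output.
out = torch.randn(4, requires_grad=True)

scale = 1e-4  # hypothetical scale factor for the oversized Loss4 gradient
handle = out.register_hook(lambda g: g * scale)  # returned value replaces g

loss4 = 1e4 * (out ** 2).mean()  # stand-in for the large custom loss
loss4.backward()
handle.remove()  # remove the hook so other losses' gradients are untouched
```

Note also that Adam normalizes per-parameter updates by a running estimate of the gradient's second moment, so a uniform rescaling of a loss (or its gradient) has much less effect on the update direction than it would with plain SGD; that could explain why the scalings showed no visible effect.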

Am I going in the right direction, or am I missing something? How should I choose the hyperparameter values to improve the result?

By 4 losses, do you mean something like
loss = lambda1*loss1 + lambda2*loss2 + lambda3*loss3 + lambda4*loss4, and scaling the lambda parameters? As a sanity check you can verify that the gradients are different with lambda4 = 0.
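That sanity check could be sketched like this (toy losses with scales roughly matching the ones described above; none of this is the actual model):

```python
import torch

w = torch.randn(3, requires_grad=True)  # stand-in for network parameters

def losses(w):
    # Hypothetical per-term losses at roughly the scales in the question.
    l1 = 0.1 * (w ** 2).sum()
    l2 = 0.2 * (w ** 2).sum()
    l3 = 10.0 * (w ** 2).sum()
    l4 = 1e4 * (w ** 2).sum()
    return l1, l2, l3, l4

def total_grad(lambda4):
    w.grad = None  # clear any previous gradient
    l1, l2, l3, l4 = losses(w)
    loss = 1.0 * l1 + 1.0 * l2 + 1.0 * l3 + lambda4 * l4
    loss.backward()
    return w.grad.clone()

g_with = total_grad(1.0)
g_without = total_grad(0.0)
# If loss4 actually contributes, the two gradients must differ.
print(torch.allclose(g_with, g_without))
```

If the two gradients come out identical, the custom loss is not participating in the backward pass at all (e.g. a detached tensor somewhere in its computation).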


Yes, it's different.

Following your suggestion, I started observing the response of each loss individually. Loss1 is a GAN loss (implemented as MSE loss), and I checked the GAN loss alone with 150 training images as a sample run. I observed that at the start both the generator loss and the discriminator loss are around 1.6; they quickly drop to around 0.7 in the second epoch and to around 0.3 in the third. I ran for 10 epochs, but the output images are not impressive. PSNR and SSIM during validation (on just 4 images) also fluctuate over a wide range. The gradients at the output image stay in the range e-04 to e-05 from the 1st epoch through the 10th.
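With only 4 validation images, large fluctuation in PSNR/SSIM is expected regardless of convergence, so averaging over a bigger validation set would give a more reliable signal. As a sanity check of the metric itself, here is a minimal PSNR implementation using the standard definition (hypothetical helper, assuming images scaled to `[0, data_range]`):

```python
import numpy as np

def psnr(ref, test, data_range=1.0):
    # Peak signal-to-noise ratio: 10 * log10(data_range^2 / MSE).
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)

# With a uniform error of 0.1, MSE = 0.01 and PSNR = 20 dB.
a = np.zeros((16, 16))
b = np.full((16, 16), 0.1)
print(psnr(a, b))
```

Comparing this against the values reported during validation can rule out a metric bug before blaming the training itself.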

Is the network not converging, or do I need to check with a larger number of images?