Adding additional loss function to a pretrained network

Assume we have a network 1 with a set of convolution layers (final conv feature map size (B, 512, 16, 16)), followed by AvgPool2d (giving (B, 512)) and finally an FC layer (output size (B, 128)). I have trained this network with loss 1. Later I load this network and fine-tune it with either loss 2 or with loss 1 + loss 2.
Here are the steps I used in fine-tuning:

  1. Load network 1 (let's name it model_ft).
  2. Define optimizer = optim.Adam(model_ft.parameters(), lr=0.0001).
  3. Compute network 1's outputs for some input. Assume network 1 has two outputs: out1 of size (B, 128) and out1c of size (B, 512, 16, 16).
  4. Use the above outputs to compute an activation map:
    weights_for_maps = model_ft.fc.weight
    cam_map = ...  # combine weights_for_maps with out1c to get out_final, a (B, 512, 16, 16) activation map
  5. Compute Loss 2 = pixel-wise loss(out_final, some ground truth).
  6. Call loss2.backward() and optimizer.step().
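The steps above can be sketched end to end as below. This is a minimal stand-in, not the actual network 1: the architecture, the CAM combination, and the pixel-wise loss (MSE here) are assumptions. Note that with an FC weight of shape (128, 512), the natural weighted combination of out1c yields a (B, 128, 16, 16) map rather than the (B, 512, 16, 16) stated in the question, which may itself be worth double-checking.

```python
import torch
import torch.nn as nn
import torch.optim as optim

class Net1(nn.Module):
    """Toy stand-in for network 1: conv features -> avgpool -> FC."""
    def __init__(self):
        super().__init__()
        self.features = nn.Conv2d(3, 512, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(512, 128)

    def forward(self, x):
        out1c = self.features(x)              # (B, 512, 16, 16)
        pooled = self.pool(out1c).flatten(1)  # (B, 512)
        out1 = self.fc(pooled)                # (B, 128)
        return out1, out1c

model_ft = Net1()                                        # step 1 (pretrained in practice)
optimizer = optim.Adam(model_ft.parameters(), lr=0.0001) # step 2

x = torch.randn(4, 3, 16, 16)
target_map = torch.randn(4, 128, 16, 16)  # placeholder pixel-wise ground truth

out1, out1c = model_ft(x)                 # step 3
weights_for_maps = model_ft.fc.weight     # (128, 512), step 4
# Weight each of the 512 feature maps by the FC weights -> (B, 128, 16, 16)
cam_map = torch.einsum('kc,bchw->bkhw', weights_for_maps, out1c)

loss2 = nn.functional.mse_loss(cam_map, target_map)  # step 5
optimizer.zero_grad()
loss2.backward()                                     # step 6
optimizer.step()
```

Since cam_map depends on both the conv features and fc.weight, loss2.backward() alone should populate gradients for the conv layers as well as the FC layer.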

Current problem: the network is training, but the gradients are zero, or Loss 2 is constant over epochs. However, if I use (Loss 1 + Loss 2).backward(), the gradients are non-zero because of Loss 1.

Are the gradients just small, or are they not calculated at all, when you call backward only on Loss 2?

The gradients are small. I see small updates in the parameters initially; after 2 epochs the parameters stop updating, so the loss is also constant. Thanks!
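To tell "small gradients" apart from "no gradients at all" (p.grad is None, i.e. the loss is detached from those parameters), it can help to print per-parameter gradient norms right after calling backward on Loss 2. A minimal sketch with a stand-in model and loss (your real model_ft and Loss 2 would go here):

```python
import torch
import torch.nn as nn

# Stand-in model and loss; replace with the real model_ft and Loss 2.
model_ft = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.Flatten(),
    nn.Linear(8 * 16 * 16, 10),
)
x = torch.randn(2, 3, 16, 16)
loss2 = model_ft(x).pow(2).mean()
loss2.backward()

# Inspect which parameters actually receive gradients, and how large they are
for name, p in model_ft.named_parameters():
    if p.grad is None:
        print(f'{name}: no gradient (detached from the graph?)')
    else:
        print(f'{name}: grad norm = {p.grad.norm().item():.3e}')
```

If every norm is tiny but non-zero, the graph is intact and the issue is scale (learning rate, loss weighting); if some entries show no gradient, part of the computation is detached.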

Any suggestions, please?

If the parameters are not being updated after some iterations/epochs, you might want to increase the learning rate (or experiment with other optimizers).
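A sketch of those suggestions, using a stand-in model (the layer layout and the specific rates are illustrative, not a recommendation for your exact network): a larger flat learning rate, per-parameter-group rates so the later layers move faster, or a different optimizer altogether.

```python
import torch.nn as nn
import torch.optim as optim

model_ft = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.Flatten(),
    nn.Linear(8 * 16 * 16, 128),
)

# Option 1: simply raise the flat learning rate
optimizer = optim.Adam(model_ft.parameters(), lr=1e-3)

# Option 2: per-group rates -- smaller for early layers, larger for the head
optimizer = optim.Adam([
    {'params': model_ft[:2].parameters(), 'lr': 1e-4},
    {'params': model_ft[2].parameters(),  'lr': 1e-3},
])

# Option 3: try a different optimizer, e.g. SGD with momentum
optimizer = optim.SGD(model_ft.parameters(), lr=1e-2, momentum=0.9)
```

With per-parameter groups you can keep the pretrained backbone nearly frozen while letting the layers that feed Loss 2 adapt more quickly.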