Custom loss functions

  1. No, the parameters will only be updated in the optimizer.step() call. The gradients of the parameters of reused modules will be accumulated if the corresponding computation graph uses them.

A small illustration of my last post:
Assuming your model architecture is:

input -> conv1 -> conv2 -> conv3 -> conv4 -> conv5 -> output -> loss2
                                           \-> conv4_output -> loss1

If this is the workflow of the loss calculations, then loss1.backward() will accumulate gradients for the parameters in conv1-4, while loss2.backward() will accumulate gradients for the parameters in conv1-5.
The same applies to the sum of both losses, i.e. calling (loss1 + loss2).backward() once.
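
To make this concrete, here is a minimal sketch of the architecture above (the layer sizes, MSE targets, and the BranchedModel name are hypothetical, only for illustration). It shows that loss1.backward() only populates gradients for conv1-4, that loss2.backward() accumulates into the same .grad buffers, and that optimizer.step() then applies the update:

```python
import torch
import torch.nn as nn

# Hypothetical tiny model matching the sketch above: five conv layers,
# with a branch after conv4 feeding loss1 and the final output feeding loss2.
class BranchedModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, 3, padding=1)
        self.conv2 = nn.Conv2d(8, 8, 3, padding=1)
        self.conv3 = nn.Conv2d(8, 8, 3, padding=1)
        self.conv4 = nn.Conv2d(8, 8, 3, padding=1)
        self.conv5 = nn.Conv2d(8, 3, 3, padding=1)

    def forward(self, x):
        x = self.conv4(self.conv3(self.conv2(self.conv1(x))))
        return self.conv5(x), x  # (output, conv4_output)

model = BranchedModel()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.MSELoss()

x = torch.randn(2, 3, 16, 16)
output, conv4_output = model(x)
loss1 = criterion(conv4_output, torch.randn_like(conv4_output))
loss2 = criterion(output, torch.randn_like(output))

# loss1 only depends on conv1-4, so conv5 receives no gradient from it.
# retain_graph=True keeps the shared graph alive for the second backward call.
loss1.backward(retain_graph=True)
print(model.conv4.weight.grad.abs().sum())  # non-zero
print(model.conv5.weight.grad)              # None: conv5 is not in loss1's graph

# loss2 depends on all five layers; its gradients are accumulated
# into the .grad buffers that loss1.backward() already filled.
loss2.backward()
print(model.conv5.weight.grad.abs().sum())  # now non-zero

optimizer.step()       # parameters are updated here, using the accumulated gradients
optimizer.zero_grad()  # clear .grad before the next iteration
```

Calling (loss1 + loss2).backward() once instead of the two separate backward calls yields the same accumulated gradients and avoids the need for retain_graph=True.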