- No, the parameters are only updated in the
optimizer.step() call. The gradients of the parameters of reused modules will be accumulated whenever the corresponding computation graph uses them.
A small illustration of my last post:
Assuming your model architecture is:
input -> conv1 -> conv2 -> conv3 -> conv4 -> conv5 -> output -> loss2
                                         \-> conv4_output -> loss1
If this is the workflow of the loss calculations, then loss1.backward() will accumulate gradients for the parameters of conv1-conv4, while loss2.backward() will accumulate gradients for the parameters of conv1-conv5.
The same applies to the sum of both losses.
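To make this concrete, here is a minimal, self-contained sketch of that workflow. The layer shapes, the MSE criteria, and the SGD optimizer are my own illustrative assumptions, not taken from the post above:

```python
import torch
import torch.nn as nn

class BranchedNet(nn.Module):
    """Five conv layers; conv4's output branches into an auxiliary loss."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, 3, padding=1)
        self.conv2 = nn.Conv2d(8, 8, 3, padding=1)
        self.conv3 = nn.Conv2d(8, 8, 3, padding=1)
        self.conv4 = nn.Conv2d(8, 8, 3, padding=1)
        self.conv5 = nn.Conv2d(8, 8, 3, padding=1)

    def forward(self, x):
        x = self.conv3(self.conv2(self.conv1(x)))
        conv4_output = self.conv4(x)       # branch point
        output = self.conv5(conv4_output)  # main head
        return conv4_output, output

model = BranchedNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.MSELoss()

x = torch.randn(2, 3, 16, 16)
target = torch.randn(2, 8, 16, 16)

conv4_output, output = model(x)
loss1 = criterion(conv4_output, target)  # depends on conv1-conv4 only
loss2 = criterion(output, target)        # depends on conv1-conv5

optimizer.zero_grad()
# retain_graph=True keeps the shared graph (conv1-conv4) alive for the
# second backward; their .grad tensors accumulate across both calls.
loss1.backward(retain_graph=True)
loss2.backward()
optimizer.step()  # parameters are only updated here

# conv5 only received gradients from loss2:
print(model.conv5.weight.grad.abs().sum())
```

Calling loss1.backward(retain_graph=True) followed by loss2.backward() leaves the same accumulated .grad values as a single (loss1 + loss2).backward(); in both cases nothing changes in the parameters until optimizer.step() runs.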