- No, the parameters will only get updated in the `optimizer.step()` call. The gradients of the parameters of reused modules will be accumulated, if the corresponding computation graph uses them.
To give a small illustration of my last post, assume your model architecture is:
```
input -> conv1 -> conv2 -> conv3 -> conv4 -> conv5 -> output -> loss2
                                         \-> conv4_output -> loss1
```
If this is the workflow of the loss calculations, then `loss1.backward()` will accumulate gradients for the parameters of `conv1`-`conv4`, while `loss2.backward()` will accumulate gradients for the parameters of `conv1`-`conv5`.
The same applies to calling `backward()` on the sum of both losses.
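To make this concrete, here is a minimal, self-contained sketch of the diagram above; the layer shapes and the two `mean()` losses are made up for illustration:

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        # Stand-ins for conv1 - conv5 from the diagram above
        self.conv1 = nn.Conv2d(3, 8, 3, padding=1)
        self.conv2 = nn.Conv2d(8, 8, 3, padding=1)
        self.conv3 = nn.Conv2d(8, 8, 3, padding=1)
        self.conv4 = nn.Conv2d(8, 8, 3, padding=1)
        self.conv5 = nn.Conv2d(8, 1, 3, padding=1)

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        conv4_output = self.conv4(x)       # branch point
        output = self.conv5(conv4_output)
        return conv4_output, output

model = Model()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

conv4_output, output = model(torch.randn(1, 3, 8, 8))
loss1 = conv4_output.mean()  # placeholder loss on the branch
loss2 = output.mean()        # placeholder loss on the main output

loss1.backward(retain_graph=True)  # fills .grad of conv1-4
print(model.conv5.weight.grad)     # None: loss1 doesn't depend on conv5
grad1 = model.conv1.weight.grad.clone()

loss2.backward()  # accumulates into conv1-4 and fills conv5
print(torch.equal(model.conv1.weight.grad, grad1))  # False: grads accumulated

optimizer.step()  # this is where the parameters are actually updated
```

Note that `retain_graph=True` is needed on the first `backward()` call here, since both losses share the intermediate activations of `conv1`-`conv4`; summing the losses and calling `backward()` once avoids that.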