Updating two sets of parameters using two optimizers FAILS

Yes, I have finally made it work thanks to God.

The problem was caused by using two optimizers, though I'm still not sure why. The reason for using two in the first place was to update two sets of parameters separately, and it turns out this can be done with just one optimizer. Here is what I did:

First, I created two loss functions:

criterion = Criterion()
decoder_criterion = AnotherCriterion()
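
For example, these placeholder names could be real PyTorch loss modules. This is only an assumption about the task: say the main model output is trained with a regression loss and the decoder output with a classification loss:

import torch.nn as nn

criterion = nn.MSELoss()                   # loss for the main model output (assumed)
decoder_criterion = nn.CrossEntropyLoss()  # loss for the decoder output (assumed)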

Then, I created a single optimizer that tracks all of the model's parameters:

opt = Optim(model.parameters())
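
Concretely, this can be any optimizer from torch.optim; I'm assuming Adam with an arbitrary learning rate here:

import torch.optim as optim

opt = optim.Adam(model.parameters(), lr=1e-3)  # one optimizer over all of the model's parameters

If the two sets of parameters need different hyperparameters (for example different learning rates), a single optimizer still works by passing parameter groups instead of model.parameters(). The submodule names below (model.encoder, model.decoder) are only hypothetical:

opt = optim.Adam([
    {'params': model.encoder.parameters()},
    {'params': model.decoder.parameters(), 'lr': 1e-4},
], lr=1e-3)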

Then, you can compute the two losses separately like so:

model_loss = criterion(model_output, model_target)
decoder_loss = decoder_criterion(decoder_output, decoder_target)

Finally, you can perform backpropagation through the two loss functions at the same time like so:

opt.zero_grad()   # clear any gradients left over from the previous iteration
loss = model_loss + decoder_loss
loss.backward()   # accumulates gradients from both losses into the parameters' .grad
opt.step()        # updates all parameters tracked by the optimizer
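
Putting it all together, here is a minimal, self-contained sketch of one training step. The model, the shapes, and the loss choices are all made up for illustration; only the overall pattern (one optimizer, two losses, a single backward) is the point:

import torch
import torch.nn as nn
import torch.optim as optim

# A toy model with two parts whose outputs are supervised by different losses (hypothetical).
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(16, 8)
        self.decoder = nn.Linear(8, 4)

    def forward(self, x):
        hidden = torch.relu(self.encoder(x))
        return hidden, self.decoder(hidden)

model = ToyModel()
criterion = nn.MSELoss()                    # supervises the main (encoder) output (assumed)
decoder_criterion = nn.CrossEntropyLoss()   # supervises the decoder output (assumed)
opt = optim.Adam(model.parameters(), lr=1e-3)  # one optimizer over ALL parameters

x = torch.randn(32, 16)                     # fake input batch
model_target = torch.randn(32, 8)           # fake target for the main output
decoder_target = torch.randint(0, 4, (32,)) # fake class labels for the decoder output

model_output, decoder_output = model(x)
model_loss = criterion(model_output, model_target)
decoder_loss = decoder_criterion(decoder_output, decoder_target)

opt.zero_grad()                             # clear old gradients
(model_loss + decoder_loss).backward()      # gradients from both losses are accumulated
opt.step()                                  # one step updates both sets of parameters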

This is how I fixed my problem. Here are a few more details in case you are interested:

  1. loss.backward(): all it does is compute the gradient of every parameter with requires_grad=True that was used in the computation of the loss. Given a parameter x, this method stores its gradient with respect to the loss in x.grad.
  2. opt.step(): all it does is update those parameters using the gradients stored in their .grad attributes.
  3. (loss1 + loss2).backward(): this is equivalent to calling backward() on loss1 and loss2 separately; the gradients from the two losses are simply accumulated (summed) in .grad, so you don't have to worry about one loss overwriting the other. A small sketch demonstrating this follows the list.
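
To see point 3 in action, here is a tiny check with a single made-up parameter, showing that summing the losses before backward() gives the same accumulated gradient as calling backward() on each loss separately:

import torch

# One parameter, two simple "losses" that both depend on it.
w = torch.tensor([2.0], requires_grad=True)
loss1 = (w * 3).sum()    # d(loss1)/dw = 3
loss2 = (w ** 2).sum()   # d(loss2)/dw = 2 * w = 4

(loss1 + loss2).backward()
print(w.grad)            # tensor([7.]) = 3 + 4, i.e. the gradients are summed

# Equivalent: two separate backward() calls accumulate into the same w.grad.
w.grad = None
loss1b = (w * 3).sum()
loss2b = (w ** 2).sum()
loss1b.backward()
loss2b.backward()
print(w.grad)            # tensor([7.]) again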

This is what I have learned so far. Please don't hesitate to correct me if I'm wrong.