Hi, I have a specific question. In each training step I have two input pairs, (x1, y1) and (x2, y2), and the model is composed of two parts, model_a and model_b. The loss is computed as |y - model_b(model_a(x))|.
The problem is that I want the loss computed from the pair (x1, y1) to update the parameters of both model_a and model_b, while the loss computed from (x2, y2) should update only model_a's parameters. Is this possible?
Here is what I have tried: I use two optimizers, optim_a and optim_b. optim_a updates the parameters of both model_a and model_b, while optim_b updates only the parameters of model_a. (Since model_a.parameters() returns a generator, the two parameter sets have to be concatenated as lists:)
optim_a = torch.optim.Adam(list(model_a.parameters()) + list(model_b.parameters()))
optim_b = torch.optim.Adam(model_a.parameters())
In the training phase, I compute both losses:
loss1 = |y1 - model_b(model_a(x1))|
loss2 = |y2 - model_b(model_a(x2))|
Then,
optim_a.zero_grad()
optim_b.zero_grad()
loss1.backward()
optim_a.step()
loss2.backward()
optim_b.step()
However, when calling loss2.backward(), I get "RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation". I believe this is because optim_a.step() modifies the parameters of model_a and model_b in place, while loss2's graph still needs their original values for its backward pass.
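For reference, here is a minimal self-contained script that reproduces the error; the Linear layers, shapes, and random data are just stand-ins for my real models:

```python
import torch

torch.manual_seed(0)

# Toy stand-ins for my real model_a / model_b (assumed shapes).
model_a = torch.nn.Linear(4, 4)
model_b = torch.nn.Linear(4, 1)

# optim_a updates both parts, optim_b only model_a.
optim_a = torch.optim.Adam(list(model_a.parameters()) + list(model_b.parameters()))
optim_b = torch.optim.Adam(model_a.parameters())

x1, y1 = torch.randn(8, 4), torch.randn(8, 1)
x2, y2 = torch.randn(8, 4), torch.randn(8, 1)

loss1 = (y1 - model_b(model_a(x1))).abs().mean()
loss2 = (y2 - model_b(model_a(x2))).abs().mean()

optim_a.zero_grad()
optim_b.zero_grad()
loss1.backward()
optim_a.step()        # updates the weights in place ...

hit_inplace_error = False
try:
    loss2.backward()  # ... but loss2's graph saved the old weight values
    optim_b.step()
except RuntimeError as e:
    hit_inplace_error = True
    print("RuntimeError:", e)
```

Running this prints the same "modified by an inplace operation" RuntimeError on loss2.backward().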
Any ideas on how to solve this? Need help!!