Training with multiple neural nets

I’m fairly new to PyTorch and I would currently like to train a model which consists of the following:
I have one already trained model (net1) which receives as input the output of an untrained model (net2) and generates an output (net2 → net1 → output).
My question then is: is it possible to compute the gradients through net1 and net2, but only run optimizer.step() and update the weights of net2?

Yes, that’s possible and you could directly pass the output of net2 to net1, compute the loss, and call loss.backward() to compute the gradients.
If you don’t want to train net1, you could freeze all of its parameters via:

for param in net1.parameters():
    param.requires_grad = False

To update the parameters of net2, pass its parameters to the optimizer via:

optimizer = torch.optim.SGD(net2.parameters(), lr=1e-3)
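
Putting it together, a minimal end-to-end sketch could look like this (the Linear layers, shapes, and dummy data are only placeholders for your actual models and dataset):

import torch
import torch.nn as nn

# placeholder models; replace with your pretrained net1 and untrained net2
net1 = nn.Linear(8, 1)   # pretrained, to be kept frozen
net2 = nn.Linear(4, 8)   # to be trained

# freeze net1 so its weights are never updated
for param in net1.parameters():
    param.requires_grad = False

optimizer = torch.optim.SGD(net2.parameters(), lr=1e-3)
criterion = nn.MSELoss()

x = torch.randn(16, 4)   # dummy input batch
y = torch.randn(16, 1)   # dummy targets

# net2 -> net1 -> output: gradients flow back through the frozen net1 into net2
y_hat = net1(net2(x))
loss = criterion(y_hat, y)

optimizer.zero_grad()
loss.backward()
optimizer.step()   # updates only net2's parameters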

Thank you for the quick answer! I have modified my code to follow the instructions you suggested; is it correct?

for param in net1.parameters():
    param.requires_grad = False

optimizer = torch.optim.SGD(net2.parameters(), lr=1e-3)

for i in torch.randperm(len(train_inputs)):
    x = train_inputs[i]
    y_hat = net2(net1(x))
    loss = criterion(y[i], y_hat)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

The code looks generally alright, but you’ve switched the model order.
In your initial post you described net2 -> net1 -> output, while you are now using net1 -> net2 -> output.
Also, I don’t know which criterion you are using, but often the model output is the first argument and the target the second, so you might want to switch this as well.
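
For reference, here is how the loop would look with both fixes applied (a sketch reusing your variable names and the net2 -> net1 order from the first post):

for i in torch.randperm(len(train_inputs)):
    x = train_inputs[i]
    y_hat = net1(net2(x))          # net2 -> net1, matching the original description
    loss = criterion(y_hat, y[i])  # model output first, target second
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()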


Yes, I switched the order the second time, sorry for my mistake! :sweat_smile:
My criterion is MSELoss, which I use to compute the loss on the chained model’s output.
It seems to be working all right so far!

criterion = nn.MSELoss()
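
As a quick sanity check that only net2 is being trained (a hypothetical snippet, assuming the setup above), you can verify that net1’s parameters stay frozen and unchanged across an update step:

# net1's parameters should not require gradients
assert all(not p.requires_grad for p in net1.parameters())

before = [p.detach().clone() for p in net1.parameters()]
# ... run one training step (forward, backward, optimizer.step()) ...
after = list(net1.parameters())
assert all(torch.equal(b, a) for b, a in zip(before, after))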