Hi,
If I understand properly, you have this:
input = # ...
out1 = Model1(input)
out2 = Model2(out1)
loss = LossFunc(out2)
If you want to optimize only the parameters of Model1, you can simply call loss.backward() and construct your optimizer so that it updates just the first model, e.g. optimizer = torch.optim.SGD(Model1.parameters(), other_args). That way, optimizer.step() will update Model1 but leave Model2 untouched.
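A minimal sketch of this (the toy linear layers, sizes, and loss are assumptions, just to make it runnable):

```python
import torch
import torch.nn as nn

# Toy stand-ins for Model1 and Model2
model1 = nn.Linear(4, 8)
model2 = nn.Linear(8, 2)

# Optimizer over Model1's parameters only
optimizer = torch.optim.SGD(model1.parameters(), lr=0.1)

x = torch.randn(3, 4)
target = torch.randn(3, 2)
loss_func = nn.MSELoss()

# Snapshot Model2's weights so we can check they don't change
w2_before = model2.weight.clone()

optimizer.zero_grad()
out1 = model1(x)
out2 = model2(out1)
loss = loss_func(out2, target)
loss.backward()   # gradients flow back through model2 into model1
optimizer.step()  # only model1's parameters are updated

assert torch.equal(model2.weight, w2_before)  # model2 unchanged
```

Note that model2 still accumulates .grad here (its parameters require grad by default); it just never gets stepped.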
Note that whatever you do, to get gradients for Model1 you will have to backpropagate through Model2 (that is just how backprop works: the chain rule needs the gradient of the loss with respect to Model2's input).
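To make that concrete: you can freeze Model2's parameters with requires_grad_(False) so no gradients are accumulated for them, yet the backward pass still propagates through Model2's operations to reach Model1. A small sketch (toy layers are assumptions):

```python
import torch
import torch.nn as nn

model1 = nn.Linear(4, 8)
model2 = nn.Linear(8, 2)

# Freeze Model2's parameters: no gradients accumulate for them,
# but backward still flows through model2's ops via the chain rule
for p in model2.parameters():
    p.requires_grad_(False)

out2 = model2(model1(torch.randn(3, 4)))
out2.sum().backward()

assert model1.weight.grad is not None  # model1 received gradients
assert model2.weight.grad is None      # model2 accumulated none
```

So freezing saves the parameter-gradient computation for Model2, but the backward pass through Model2's forward computation cannot be skipped.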