So I have this:
import torch
import torch.nn as nn

class Net1(nn.Module):
    ...  # defined it

Model1 = Net1()
optimizer1 = ...  # defined accordingly over Model1.parameters()

# pretraining over a loss function, say loss_fn1
for i, (img, label) in enumerate(train_loader):
    ...
What's the best way to add a few more layers on top of this and train with a new loss function (say loss_fn2)?
Please let me know if I am being unclear.
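Concretely, by "a few more layers" I mean something like this (just a sketch; Net2's layer sizes and the Adam choice are placeholders I made up):

import torch
import torch.nn as nn

class Net2(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(128, 64),  # 128 = assumed output size of Net1
            nn.ReLU(),
            nn.Linear(64, 10),   # 10 = assumed number of classes
        )

    def forward(self, x):
        return self.fc(x)

Model2 = Net2()
optimizer2 = torch.optim.Adam(Model2.parameters(), lr=1e-3)  # placeholder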
One way I tried was to define these new layers as a new model with its own optimizer (Model2 and optimizer2, as sketched above) and do this:
for i, (img, label) in enumerate(train_loader):
    optimizer1.zero_grad()  # clear stale gradients before each step
    optimizer2.zero_grad()
    out = Model1(img)
    out = Model2(out)
    loss = loss_fn2(out, label)
    loss.backward(retain_graph=True)
    optimizer2.step()
    optimizer1.step()
This looks fundamentally wrong to me now. Are the gradients of the loss actually backpropagated, via the chain rule, to the pretrained model's parameters? I don't think the two models are connected.
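One sanity check I can think of is to inspect the gradients right after loss.backward() in the loop above (a sketch; nothing beyond Model1 is assumed):

# Check whether any gradient actually reached Model1's parameters.
for name, p in Model1.named_parameters():
    got_grad = p.grad is not None and p.grad.abs().sum().item() > 0
    print(f"{name}: received gradient = {got_grad}")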
I don't know how to stitch these two models together so that the first model's parameters come from pretraining; a sketch of the kind of combined model I am imagining is below. Any lead or hint is appreciated. Thanks!!
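Here is that sketch (CombinedNet, the single Adam optimizer, and the learning rate are placeholders, not code I have working):

import torch
import torch.nn as nn

# Wrap the pretrained model and the new layers as one module, so a single
# optimizer sees both parameter sets and loss_fn2 can update everything.
class CombinedNet(nn.Module):
    def __init__(self, pretrained, new_layers):
        super().__init__()
        self.pretrained = pretrained  # weights come from pretraining
        self.new_layers = new_layers  # freshly initialized layers

    def forward(self, x):
        return self.new_layers(self.pretrained(x))

combined = CombinedNet(Model1, Model2)
optimizer = torch.optim.Adam(combined.parameters(), lr=1e-3)  # placeholder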