I have a model M and I am cloning it with M.clone().
Now I want to freeze certain layers of M.clone(). When I set requires_grad = False on their parameters, I get this error:
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
How can I freeze layers of M.clone() in that case? I want to ensure that when I backpropagate the loss computed on a batch with M.clone(), the gradients are computed for M's parameters.
A small script:
model = ResNet()
optimizer = Adam(model.parameters())
cloned_model = model.clone() # .clone() is a custom method that creates a copy of the model
for p in cloned_model.features.parameters():
    p.requires_grad = False  # this is the line that raises the RuntimeError above
error = loss(cloned_model(data), labels)
error.backward()
optimizer.step()
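For reference, here is a minimal sketch of the kind of workaround the error message itself suggests (using .detach()), assuming the custom .clone() stores its parameters as plain non-leaf tensors in each submodule's _parameters dict, as MAML-style clones usually do: replace the parameters of the layers to be frozen with detached copies, so no gradient flows back to M through them, while the remaining layers stay connected.

for submodule in cloned_model.features.modules():
    for key, p in submodule._parameters.items():
        if p is not None:
            # detached tensors don't require grad, so these layers are
            # effectively frozen and contribute no gradient to M
            submodule._parameters[key] = p.detach()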
Thanks @anantguptadbl! However, by writing cloned_model.features I wanted to indicate that I don't want to freeze all of the layers, only some of them. Regarding my question above, can you suggest anything?
Thanks @anantguptadbl, ResNet() is just an example; you can consider any model (say a CNN with 4 layers). My question remains the same: setting requires_grad = False on cloned_model.layer1.parameters() results in the RuntimeError I mentioned above.
@ptrblck thanks for your reply in the other thread.
I have one more issue related to this loss problem. Can you please give some insight on it? Once again, thanks a lot!
.clone() is a tensor method and is undefined for nn.Modules, so your code should already fail at model.clone() unless it's a custom method, which isn't posted here.
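For context, here is a sketch of how such a custom clone is often implemented in MAML-style code (the clone_module name and structure are illustrative assumptions, not the poster's actual method); it also shows why the cloned parameters end up as non-leaf tensors.

import copy
import torch.nn as nn

def clone_module(module: nn.Module) -> nn.Module:
    # Deep-copy the module, then replace every parameter with a
    # differentiable .clone() of the original tensor, so a backward
    # pass through the clone accumulates gradients in the original model.
    cloned = copy.deepcopy(module)
    for cloned_sub, orig_sub in zip(cloned.modules(), module.modules()):
        for key, p in orig_sub._parameters.items():
            if p is not None:
                # p.clone() is a non-leaf tensor with requires_grad=True;
                # this is why requires_grad can no longer be toggled on it
                cloned_sub._parameters[key] = p.clone()
    return cloned

A plain copy.deepcopy(model) would give leaf parameters (so requires_grad could be set on them directly), but then backward() would populate the copy's own .grad fields rather than M's. That is why a clone like the sketch above keeps the graph connection to M, and why freezing its layers has to go through .detach() instead of requires_grad.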