I want to know how to set the arguments of an optimizer when loading a pretrained net, with its gradients frozen, into a new net. My understanding is that I should pass only the remaining subset of new_net (the part not covered by the pretrained net) as the optimizer's first argument. Assume that the pretrained net is a subset of the new net.
Here is an example:

pretrained_net = PretrainedNet()
new_net = NewNet()

# Freeze the gradients of pretrained_net.
for param in pretrained_net.parameters():
    param.requires_grad = False

# Copy pretrained_net's state_dict into new_net's state_dict,
# leaving the other entries of new_net's state_dict unchanged.
pretrained_dict = pretrained_net.state_dict()
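In case the copy step is unclear, this is roughly the pattern I am following (a sketch, assuming the parameter names in pretrained_dict match the corresponding names in new_net):

new_dict = new_net.state_dict()
# Keep only the pretrained entries whose names also exist in new_net.
pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in new_dict}
# Overwrite those entries; everything else in new_dict stays as initialized.
new_dict.update(pretrained_dict)
new_net.load_state_dict(new_dict)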
Assume that I know the remaining subset of new_net includes the following (aaa, bbb, and ccc are placeholder layer names):

a = torch.nn.aaa()
b = torch.nn.bbb()
c = torch.nn.ccc()
When I tried to set up the optimizer like this:
optimizer = Adam([a.parameters(), b.parameters(), c.parameters()])
I got TypeError: optimizer can only optimize Variables, but one of the params is Module.parameters.
What is the right way to write the optimizer call? Thank you for any help in advance.
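From the error message I guess Adam wants a flat iterable of parameter tensors (or of parameter-group dicts), not a list of generators. Is something like the following the intended usage? (Just a sketch of my guess, chaining the placeholder modules' parameters with itertools.)

import itertools
import torch

optimizer = torch.optim.Adam(
    itertools.chain(a.parameters(), b.parameters(), c.parameters()),
    lr=1e-3,  # placeholder learning rate
)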
However, the parameters coming from pretrained_net turn out to have requires_grad = True in new_net, even though I already set them to requires_grad = False explicitly. I suspect this is because copying pretrained_net's state_dict into new_net's state_dict only transfers tensor values: new_net's parameters are separate tensor objects, so the requires_grad = False flags I set on pretrained_net never reach them, and they keep the default, True. What is the right way to set only the parameters coming from pretrained_net to requires_grad = False after copying their state_dict into new_net's state_dict?
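The only workaround I can think of is to flip the flags on new_net's own parameters after the copy, keyed by the names in pretrained_dict, but I am not sure it is idiomatic (a sketch, reusing pretrained_dict from above):

# Freeze exactly those parameters of new_net that were copied
# from the pretrained net (their names appear in pretrained_dict).
for name, param in new_net.named_parameters():
    if name in pretrained_dict:
        param.requires_grad = False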
I can’t come up with a perfect idea at the moment, but how about making NewNet hold PretrainedNet as one of its layers, just like nn.Conv2d or nn.Linear? Then it becomes easier to load the weights and change requires_grad. I’m not sure this is the best approach, but I paste a toy snippet below.
import torch
from torch import nn

class NewNet(nn.Module):
    def __init__(self, **kwargs):
        super().__init__()
        self.pretrained = PretrainedNet()
        # register layers unique to NewNet below.

    def forward(self, x, **kwargs):
        # forward computation
        ...

new_net = NewNet()
pretrained_state_dict = torch.load('path/to/state_dict')  # or PretrainedNet().state_dict()
new_net.pretrained.load_state_dict(pretrained_state_dict)
new_net.pretrained.requires_grad_(False)  # freeze all parameters of the pretrained part
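With this layout, I think the optimizer question above also gets simpler: instead of listing modules by hand, you can filter new_net's parameters on requires_grad (again just a sketch):

# Optimize only the parameters that are still trainable,
# i.e. everything except the frozen pretrained submodule.
optimizer = torch.optim.Adam(
    (p for p in new_net.parameters() if p.requires_grad),
    lr=1e-3,  # placeholder learning rate
)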
Again, I’m not confident this will work well for you, but I hope it helps.