How to set the arguments of optimizer() when loading a pretrained net with its gradients frozen into a new net?

I want to know how to set the arguments of optimizer() when loading a pretrained net, with its gradients frozen, into a new net. My understanding is that I should pass only the remaining subset of new_net's parameters as the optimizer's first argument. Assume that the pretrained net is a subset of the new net.
Here is an example:

import torch
import torch.nn as nn

class PretrainedNet(nn.Module):
    ...

class NewNet(nn.Module):
    ...

pretrained_net = PretrainedNet()
new_net = NewNet()

# Freeze the gradients of pretrained_net.
for param in pretrained_net.parameters():
    param.requires_grad = False

# Copy pretrained_net's state_dict into new_net's state_dict,
# leaving the remaining entries of new_net's state_dict unchanged.
pretrained_dict = pretrained_net.state_dict()
new_dict = new_net.state_dict()
new_dict.update(pretrained_dict)
new_net.load_state_dict(new_dict, strict=True)

Assume that I know the remaining subset of new_net includes:
a = torch.nn.aaa()
b = torch.nn.bbb()
c = torch.nn.ccc()

When I tried to set up the optimizer with
optimizer = Adam([a.parameters(), b.parameters(), c.parameters()])

I got TypeError: optimizer can only optimize Variables, but one of the params is Module.parameters.

What is the right way to write the optimizer call? Thank you in advance for any help.

Hi,

That would be solved by
Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-3)

The details are here https://github.com/pytorch/pytorch/issues/679
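For reference, the TypeError appears because [a.parameters(), b.parameters(), c.parameters()] is a list of generator objects, while the optimizer expects an iterable of tensors (or of parameter-group dicts). A minimal sketch of both options, assuming a, b, c are the new submodules and the copied parameters of new_net have been frozen:

import itertools
from torch.optim import Adam

# Option 1: chain the parameters of the known new submodules into one iterable.
optimizer = Adam(itertools.chain(a.parameters(), b.parameters(), c.parameters()), lr=1e-3)

# Option 2: hand the optimizer every parameter of new_net that still requires gradients.
optimizer = Adam(filter(lambda p: p.requires_grad, new_net.parameters()), lr=1e-3)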


@crcrpar
Thank you for your advice. Yes, it works!


@crcrpar
I have encountered another issue related to the previous one.

Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-3)

works fine.
However, the parameters coming from pretrained_model turn out to have requires_grad = True in the new_net, even though I already set them to requires_grad = False explicitly. It seems that copying pretrained_net's state_dict into new_net's state_dict (with the other entries unchanged) does not preserve the grad status, leaving those parameters at the default, True. What is the right way to set only the parameters coming from pretrained_model to requires_grad = False after copying their state_dict into new_net's state_dict?
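For reference, here is roughly how I am checking it (a small sketch; I pick out the copied parameters by the keys of the pretrained state_dict):

# After loading the merged state_dict, inspect the copied parameters of new_net.
pretrained_keys = set(pretrained_net.state_dict().keys())
for name, param in new_net.named_parameters():
    if name in pretrained_keys:
        print(name, param.requires_grad)  # prints True, not the False I expected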

Hmm…

I can’t come up with a better idea at the moment, but how about making NewNet hold PretrainedNet as one of its layers, just like nn.Conv2d or nn.Linear?
Then it would be easier to load the weights and change requires_grad. I’m not sure this is the best approach, but I paste a toy snippet below.

"""Definition"""
class PretrainedNet(nn.Module):

class NewNet(nn.Module):
    def __init__(self, **kwargs):
        super().__init__()
        self.pretrained = PretrainedNet()
        # register layers unique to NewNet below.
        ...

    def forward(self, x, **kwargs):
        # forward computation


new_net = NewNet()
pretrained_net_state_dict = torch.load(path/to/state_dict)  # or PretrainedNet()
new_net.pretrained.load_state_dict(pretrained_net_state_dict)
new_net.pretrained.requires_grad = False
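With that arrangement, the optimizer can be built the same way as before; a small sketch, assuming only the layers unique to NewNet should be trained:

from torch.optim import Adam

# Only the layers unique to NewNet still require gradients,
# so the same filter as before hands just those to the optimizer.
optimizer = Adam(filter(lambda p: p.requires_grad, new_net.parameters()), lr=1e-3)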

Again, I’m not confident this will work well for you, but I hope it helps.

@crcrpar
Thank you for your advice. Although I am not fully sure, I have started implementing the code for my case. I will let you know how my trial goes soon.

@crcrpar,

I considered how to write the scripts according to your advice. I am now stuck for two reasons:

  1. My new networks are better placed inside the pretrained_net, because they exchange data with some of the networks inside the pretrained_net.
  2. I searched extensively for a way to set requires_grad on a specific network, such as module_name.requires_grad = True, but I have not found such a method yet (the closest workaround I know of is in the sketch after this list).
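
For item 2, the closest I can get is looping over that submodule's parameters (a small sketch; module_name is a placeholder for the specific subnet):

# Toggle requires_grad for one specific submodule by flipping the flag
# on each of its parameters.
for param in module_name.parameters():
    param.requires_grad = True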

Therefore, I have created another post, linked below, to proceed with my original idea.

If you come up with any other idea, let me know.