How to optimize the parameters of the model and the loss at the same time?

I use the code below to optimize the parameters of both the model and the loss:
"trainable = [{‘params’: loss.parameters(), ‘lr’: args.lr},
{‘params’: model.parameters(), ‘lr’: args.lr}] "
I found a strange phenomenon: with the setup above, after training for several epochs the loss drops from 11.7 to 10.8 and then just stays there.
But if I put all the parameters in the model and use the following code instead, the loss drops from 11.7 to 1.3:
"trainable = [{‘params’: model.parameters(), ‘lr’: args.lr}] "

PS: the weights in the loss are normalized, because if I don't normalize them the loss sometimes becomes NaN.
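
For reference, the normalization looks roughly like this (a minimal sketch of the loss head with the angular-margin logic left out; the names ArcFaceHead, embedding_size, and num_classes are illustrative, not from my actual code):

import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcFaceHead(nn.Module):
    # The last FC layer, moved out of the backbone into the loss module.
    def __init__(self, embedding_size, num_classes):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(num_classes, embedding_size))
        nn.init.xavier_uniform_(self.weight)

    def forward(self, embeddings):
        # L2-normalizing both the embeddings and the weights bounds the
        # logits to cosine similarities in [-1, 1], which is what prevents
        # the occasional NaN losses mentioned above.
        return F.linear(F.normalize(embeddings), F.normalize(self.weight))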

Hi,

Did you add all of the params (including both model.parameters() and loss.parameters()) to the optimizer in the first way?
In the second way, model.parameters() reaches both the model params and the loss params, so everything gets updated and the loss goes down. Could you post a code snippet to make this clearer?
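
For model.parameters() to cover the loss params in the second way, the loss head has to be registered as a submodule of the model, along these lines (a hypothetical sketch, not your actual code):

import torch.nn as nn

class FaceModel(nn.Module):
    def __init__(self, backbone, head):
        super().__init__()
        self.backbone = backbone  # e.g. ResNet-50 without its last FC
        self.head = head          # the FC / margin layer from the loss

    def forward(self, x):
        return self.head(self.backbone(x))

# Because `head` is assigned as a Module attribute, its parameters are
# registered and show up in model.parameters() automatically.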

1. "Did you add all of the params (including both model.parameters() and loss.parameters()) to the optimizer in the first way?" Yes:
"trainable = [{‘params’: loss.parameters(), ‘lr’: args.lr},
{‘params’: model.parameters(), ‘lr’: args.lr}] "


2. I use a ResNet with ArcFace for face recognition. Compared with the original ResNet code, I deleted the last FC layer from resnet.py and put it in my arcface.py (which is my loss function). My training code is below:

raw_output = self.model(img)             # ResNet-50 with its last FC layer removed
raw_output = self.loss(raw_output, id)   # the last FC layer lives here (ArcFace)
loss = criterion(raw_output, id.long())

My optimization code:
"trainable = [{‘params’: loss.parameters(), ‘lr’: args.lr},
{‘params’: model.parameters(), ‘lr’: args.lr}] "
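
For completeness, the whole step fits together roughly like this (a sketch: SGD, the momentum value, and the train_loader name are assumptions; `self` is my trainer object as in the snippet above):

import torch

# Note: `trainable` must be built while `loss` still refers to the loss
# *module*; inside the loop the name is rebound to the scalar loss value.
optimizer = torch.optim.SGD(trainable, lr=args.lr, momentum=0.9)

for img, id in train_loader:
    optimizer.zero_grad()
    raw_output = self.model(img)            # backbone embeddings
    raw_output = self.loss(raw_output, id)  # ArcFace logits
    loss = criterion(raw_output, id.long())
    loss.backward()
    optimizer.step()                        # updates both param groups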