I am using an off-the-shelf action recognition system (TSN) that I pretrained on 195 classes. Now I want to fine-tune it on another dataset that has 25 classes. When I load my pretrained model, I do NOT get an error about a mismatch between the class counts, even though I have not changed the last layer. Did the system adapt to the new last layer by itself, or am I doing something wrong? I am getting very low accuracy.
Most likely your model didn't adapt itself to the new number of classes. Since the new dataset contains fewer classes than the one the model was pre-trained on, you won't get an out-of-index error.
However, you are "wasting" model capacity, since the majority of the output neurons aren't used (the neurons corresponding to class 26 through class 195).
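A quick way to see why no error is raised: targets from the 25-class dataset are all valid indices into a 195-way output, so the loss computes without complaint; the extra logits are simply never the target class. (The shapes below are made up for illustration.)

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 195)          # fake network output for a batch of 4
targets = torch.randint(0, 25, (4,))  # labels from the smaller 25-class dataset

# Labels 0..24 are valid indices into a 195-way output, so this runs fine --
# the remaining 170 output neurons are just never the target.
loss = F.cross_entropy(logits, targets)
print(loss.item())
```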
I would recommend changing the last layer to match the new number of classes, as I would guess you'll see a performance boost.
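A minimal sketch of swapping the head, using a toy stand-in for the pre-trained network (the class below is hypothetical; I'm assuming the real model exposes its classifier as model.fc, as ResNet-style backbones do). Note that the new layer should keep the classifier's original in_features, not the old class count:

```python
import torch
import torch.nn as nn

class TinyBackbone(nn.Module):
    """Hypothetical stand-in for the pre-trained network, for illustration."""
    def __init__(self, num_classes=195):
        super().__init__()
        self.features = nn.Linear(32, 64)
        self.fc = nn.Linear(64, num_classes)

    def forward(self, x):
        return self.fc(torch.relu(self.features(x)))

model = TinyBackbone(num_classes=195)

# Replace only the output dimension; reuse the head's original input size.
# Passing the old class count (195) as in_features would be a shape bug.
in_features = model.fc.in_features
model.fc = nn.Linear(in_features, 25)

out = model(torch.randn(2, 32))
print(out.shape)  # torch.Size([2, 25])
```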
Thank you very much for your fast reply. Do you think the accuracy will increase?
That would be my guess, but I can't promise anything.
So I added this line, model.fc = torch.nn.Linear(195, 25), to change the last layer from 195 to 25 classes, but I am getting this error:
raise KeyError('missing keys in state_dict: "{}"'.format(missing))
KeyError: 'missing keys in state_dict: "set([\'fc.weight\', \'fc.bias\'])"'
It seems you are using a pre-trained model. If that's the case, you should load the state_dict with the old architecture and change the last layer afterwards. This will make sure that all parameters will be found.
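In other words, the order matters: build the model with the original 195-class head so every key in the checkpoint matches, load the weights, and only then swap the classifier. A self-contained sketch with a toy model (the helper below is hypothetical, standing in for the real TSN network):

```python
import torch
import torch.nn as nn

def make_model(num_classes):
    """Hypothetical stand-in for the TSN network, for illustration."""
    return nn.Sequential(
        nn.Linear(32, 64),
        nn.ReLU(),
        nn.Linear(64, num_classes),
    )

# Pretend this state_dict came from pre-training on 195 classes.
pretrained = make_model(195)
state_dict = pretrained.state_dict()

# 1) Build the model with the OLD 195-class head so every key matches ...
model = make_model(195)
model.load_state_dict(state_dict)  # no missing/unexpected keys now

# 2) ... and only then replace the classifier for the 25-class dataset.
model[-1] = nn.Linear(model[-1].in_features, 25)
print(model[-1].out_features)  # 25
```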
Thank you very much for answering my questions. I froze all the layers except the last fully connected layer, and after changing my optimizer from
for group in policies:
    print('group: {} has {} params, lr_mult: {}, decay_mult: {}'.format(
        group['name'], len(group['params']), group['lr_mult'], group['decay_mult']))

optimizer = torch.optim.SGD(policies, args.lr,
                            momentum=args.momentum,
                            weight_decay=args.weight_decay)
to
optimizer = torch.optim.SGD(filter(lambda p: p.requires_grad, model.parameters()), args.lr )
I am getting this error:
File "main.py", line 307, in adjust_learning_rate
    param_group['lr'] = lr * param_group['lr_mult']
KeyError: 'lr_mult'
Any idea how I can fix it? Thank you again.
Could you explain a bit how you've created policies? Is it a custom dict?
I did not create it. I am using an off-the-shelf action recognition system, TSN PyTorch. They defined it as
policies = model.get_optim_policies()
I'm not familiar with TSN, but apparently policies does not contain the key 'lr_mult'.
Could you check the repo and see if your usage is correct?
If you canât figure out the problem, I think the best approach would be to create an issue in the TSN repo.
Before doing my edits, the code worked fine. But when I added my own optimizer, it gave me this error.
I guess you are using this repo.
In your code you are filtering out all parameters which do not require gradients, so you are probably breaking the intended usage of the code.
Have a look at these lines of code.
I think if you create your own policies dict using this code for your filtered parameters, it should work again.
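The key point is that TSN's adjust_learning_rate indexes param_group['lr_mult'] (and 'decay_mult'), so the optimizer must be constructed from explicit group dicts carrying those keys rather than from a bare parameter iterable. A hedged sketch for fine-tuning only the head (the model, multiplier values, and group name are assumptions; the repo's get_optim_policies defines the intended multipliers):

```python
import torch
import torch.nn as nn

# Toy model standing in for the real network, for illustration.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 25))

# Freeze everything except the last fully connected layer.
for p in model[:-1].parameters():
    p.requires_grad = False

# PyTorch optimizers keep any extra keys you put in a param-group dict,
# so 'lr_mult' / 'decay_mult' stay accessible via optimizer.param_groups.
policies = [{
    'params': [p for p in model.parameters() if p.requires_grad],
    'lr_mult': 1,
    'decay_mult': 1,
    'name': 'finetune_fc',   # hypothetical group name
}]
optimizer = torch.optim.SGD(policies, lr=0.01, momentum=0.9, weight_decay=5e-4)
print(optimizer.param_groups[0]['lr_mult'])  # 1
```

With groups built this way, the existing adjust_learning_rate loop finds 'lr_mult' and no longer raises a KeyError.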
Have you solved the problem? I would appreciate it if you could share the solution with me.