VGG16 'fc' error when performing Transfer learning

While performing transfer learning with a pre-trained VGG16 network and the SGD optimizer, I got this error. The code I used was:

optimizer_transfer = optim.SGD(model_transfer.fc.parameters(),lr= 0.001)
AttributeError: 'VGG' object has no attribute 'fc'

When I removed the fc from the code, I got the following error:

optimizer_transfer = optim.SGD(model_transfer.parameters(),lr= 0.001)
ValueError: optimizing a parameter that doesn't require gradients

These errors occur only when I use VGG16.

That is because the VGG16 model doesn't have model.fc; it has model.classifier instead. Change your code to

optim.SGD(model_transfer.classifier.parameters(), lr=0.001)

Also, it is always better to print the model first and see how its layers are named.


I think the final error occurs because you are trying to freeze part of the network, and I guess you froze everything (I am not sure, though). In optim.SGD, change the code from

model_transfer.parameters() to filter(lambda p: p.requires_grad, model_transfer.parameters())

to avoid the error, but check your network thoroughly to see which parameters and layers are frozen.

Thanks for your reply, @god_sp33d, but I got this for the first solution:
ValueError: optimizing a parameter that doesn't require gradients

For the second solution, I initially set requires_grad to False and the model did not train. When I set it to True and ran it, the batch loss was very high: 24.034479

My code is:

for param in model_transfer.parameters():
    param.requires_grad = True

criterion_transfer = nn.CrossEntropyLoss()
optimizer_transfer = optim.SGD(filter(lambda p: p.requires_grad, model_transfer.parameters()), lr=0.001)

The first change I suggested was for your model.fc error; I am assuming you fixed that by changing it to model.classifier.

Coming to my second suggestion: you are getting the following error because the optimizer is being given parameters that have requires_grad = False. So I suggested filtering those parameters out with the filter function.

ValueError: optimizing a parameter that doesn't require gradients

I think these were your concerns in the original post. As for the very high batch loss, can you please add more details, such as the batch size? Also, I see that you are fine-tuning a pretrained model, so why don't you start with a lower learning rate like 0.0001?

Also, how does the loss change as training progresses? Does it decrease, or is it very unstable? Either way, can you show some values?