Finetuning a self-supervised model

Hello Everyone,

I have a question regarding the use case I am working on. My task is to first train AlexNet in a self-supervised manner by passing rotated images from the CIFAR10 dataset and training the model to predict the rotation.

After that, I need to extract the first two conv layers of the self-supervised model and use them to train another model in a supervised way on CIFAR10, by adding a fully connected layer on top of those two conv layers. Currently, I am saving the model parameters, loading them into the Alexnet class, and then passing the model as a parameter to another class (Alexnet_supervised). I am not sure if this is the correct approach.

Model1 = Alexnet()
checkpoint = torch.load('best_model.pt')
Model1.load_state_dict(checkpoint['model_state_dict'])
Model2 = Alexnet_supervised(Model1)
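For context, here is a minimal sketch of what such a setup could look like. The layer sizes, class names, and the `freeze=True` flag are assumptions for illustration (the kernel sizes are shrunk from the original AlexNet to suit 32x32 CIFAR10 inputs), not your actual implementation:

```python
import torch
import torch.nn as nn

class Alexnet(nn.Module):
    # Simplified stand-in for the self-supervised AlexNet;
    # only the first two conv blocks matter for the transfer step.
    def __init__(self, num_rotations=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1),    # conv block 1
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 192, kernel_size=3, padding=1),  # conv block 2
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        # Rotation-prediction head (4 classes: 0/90/180/270 degrees)
        self.classifier = nn.Linear(192 * 8 * 8, num_rotations)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))

class Alexnet_supervised(nn.Module):
    def __init__(self, pretrained_model, num_classes=10, freeze=True):
        super().__init__()
        # Reuse the first two conv blocks of the self-supervised model.
        self.features = pretrained_model.features
        if freeze:
            # Optionally freeze them so only the new head is trained.
            for p in self.features.parameters():
                p.requires_grad = False
        self.fc = nn.Linear(192 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.fc(torch.flatten(x, 1))

model1 = Alexnet()
model2 = Alexnet_supervised(model1)
out = model2(torch.randn(2, 3, 32, 32))  # batch of 2 CIFAR10-sized images
print(out.shape)  # torch.Size([2, 10])
```

Whether to freeze the transferred layers or finetune them end-to-end is a design choice; freezing is common when the downstream dataset is small.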

Could anyone help me with this?

Thanks in advance.

I don’t know how Alexnet_supervised is implemented, but assuming it uses the first two layers of Model1 and adds a linear layer on top of them, your approach should be correct.

Thank you @ptrblck. Yes, I am doing exactly what you said: using the first two layers of Model1 and adding a linear layer on top of them.
I have another question for the same use case. When I train my model on the rotation prediction task with the Adam optimizer and lr = 0.001, the accuracy is low, around 40%. But if I change the learning rate to 0.0001, the accuracy increases to 84%.
Is it normal for the model to behave like this, or is there a bug in my code that I am not able to find?
Thanks in advance.

This might be expected, as a learning rate that is too high wouldn’t allow the model to converge, and the loss would get stuck at a particular level.
The first figure in e.g. this post is often used to illustrate this behavior.
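If tuning the learning rate by hand is tedious, one common option is to start with the higher value and let a scheduler such as `ReduceLROnPlateau` lower it when the validation loss stops improving. A minimal sketch (the model and the constant placeholder loss are just stand-ins for your training loop):

```python
import torch

model = torch.nn.Linear(10, 4)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Halve the learning rate once the validation loss has not
# improved for 2 consecutive epochs.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=2)

for epoch in range(10):
    # ... training step would go here ...
    val_loss = 1.0  # placeholder: compute this on your validation set
    scheduler.step(val_loss)

# The lr has been reduced below the initial 1e-3.
print(optimizer.param_groups[0]["lr"])
```

This way the run can still benefit from fast early progress at the higher rate while avoiding getting stuck at a plateau later on.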