About transfer learning (Please! help me !)

Sena_Andromeda · February 29, 2020, 4:55pm

Recently I have been working on a footprint image classification project, and I have just read the transfer learning tutorial in the official documentation on the PyTorch website (here).
All the footprint images look like this:

The dataset contains footprint images from 20 people, with 10 images per person.
I tried the script from the tutorial mentioned above and only changed the data transforms to this:

    data_transforms = {
        'train': transforms.Compose([
            #transforms.RandomResizedCrop(224),
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.RandomHorizontalFlip(),
            #transforms.RandomVerticalFlip(),
            transforms.RandomRotation(degrees=15),
            transforms.RandomRotation(degrees=30),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
        ]),
        'val': transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
        ]),
    }

Here is the result that has been puzzling me for quite some time.
When I use this:
model_ft = models.resnet50(pretrained=True)
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, 20)
I could only get an accuracy of about 60%. But when I didn't change the fc layer, like this:
model_ft = se_resnet50(pretrained = True)
I could get an accuracy as high as 97.5%.
Here are my questions:
1. When I change the fc layer, the model outputs only 20 probabilities (since my dataset has 20 persons). But why do I get worse results after changing the fc layer?
2. I have been considering using SENet (se-resnet50 here) to improve my result (from 97.5%), but I only got an accuracy of about 80%, which is worse than resnet50. How could it be worse?
3. If I want to improve my results (from 97.5%), what am I supposed to do?

Thanks in advance. I would appreciate all kinds of help.

ptrblck · March 2, 2020, 6:47am

The initialization of your new layer might not be suitable for this use case, or some other hyperparameter used to train the new layer might be off.