I have 5 classes and I’m using a DenseNet121.
In scenario A, I’m replacing the output layer as follows:
import torch
import torch.nn as nn
from torchvision import models

def get_trainable(model_params):
    return (p for p in model_params if p.requires_grad)

model = models.densenet121(pretrained=True)
num_ftrs = model.classifier.in_features
model.classifier = nn.Linear(num_ftrs, 5)  # replace the 1000-way ImageNet head with a fresh 5-way layer
optimizer = torch.optim.Adam(
    get_trainable(model.parameters()),
    lr=0.001,
)
In scenario B, I’m keeping the default output layer (the pretrained 1000-way ImageNet classifier):
def get_trainable(model_params):
    return (p for p in model_params if p.requires_grad)

model = models.densenet121(pretrained=True)  # classifier left at its default 1000 outputs
optimizer = torch.optim.Adam(
    get_trainable(model.parameters()),
    lr=0.001,
)
With the replaced 5-class output layer, the model plateaus at 76% accuracy after 20 epochs.
With the default output layer, I’m already at 76% accuracy by the 2nd epoch, and it goes up to 82%!
Can someone explain why the last layer has such an influence when everything else is the same?