I am trying to classify images from the CheXpert dataset on only one of the observations (Atelectasis), as a 2-class classification problem (1 = true, 0 = false).
I preprocess the images by resizing them to 224x224 and normalizing them. I used 30,000 images for training (with 10% held out for validation) and 7,500 test images.
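For reference, the preprocessing looks roughly like this (a minimal sketch; I am assuming the standard ImageNet mean/std for normalization, since the backbone is pretrained on ImageNet, and a grayscale-to-3-channel conversion because CheXpert images are single-channel X-rays):

```python
import torchvision.transforms as T

# Sketch of the preprocessing pipeline (exact values are assumptions)
preprocess = T.Compose([
    T.Resize((224, 224)),
    T.Grayscale(num_output_channels=3),  # replicate the single X-ray channel
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406],   # ImageNet statistics,
                std=[0.229, 0.224, 0.225]),   # matching the pretrained weights
])
```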
As a model, I am using a ResNet34 pretrained on ImageNet.
When running the model, it overfits: the training loss decreases to 0.043 whereas the validation loss rises to 2.199, which leads to a test accuracy of 55.56%.
I tried the following attempts to prevent the overfitting:
#Attempt 1: I used a classifier with dropout layers
> Validation loss decreased in the beginning, but after 6 epochs it was rising again
#Attempt 2: I added dropout layers throughout the whole model
> Validation loss decreased in the beginning, but the model converged very slowly; after some epochs the validation loss increased again and the model also converged faster
#Attempt 3: I froze all nonlinear layers and fitted the model only on the last linear layer
> The network did not seem to converge at all, not even after 50 epochs
Reformulating the task as a single-output binary classification also changed nothing.
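For completeness, that binary variant looked roughly like this (a minimal sketch; I am assuming it means a single-logit head trained with `BCEWithLogitsLoss`, and the batch here is dummy data):

```python
import torch
import torch.nn as nn

# Single-logit binary head; 512 is the in_features of ResNet34's fc layer
head = nn.Linear(512, 1)
criterion = nn.BCEWithLogitsLoss()

logits = head(torch.randn(8, 512))            # dummy pooled features
labels = torch.randint(0, 2, (8, 1)).float()  # 1 = Atelectasis present
loss = criterion(logits, labels)
```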
```python
import torch.nn as nn
import torchvision


class ResNet(FitModule):  # FitModule: training wrapper around nn.Module, defined elsewhere
    def __init__(self, num_classes=2):
        super(ResNet, self).__init__()
        self.net = torchvision.models.resnet34(pretrained=True)
        # Change classifier
        kernel_count = self.net.fc.in_features
        self.net.fc = nn.Sequential(nn.Linear(kernel_count, 500),
                                    nn.Linear(500, num_classes))
        self.dropout = nn.Dropout(p=0.5)
        # Attempt 1: use classifier with dropout layers
        '''
        self.net.fc = nn.Sequential(
            nn.BatchNorm1d(kernel_count),
            nn.Dropout(p=0.5),
            nn.Linear(in_features=kernel_count, out_features=500),
            nn.ReLU(),
            nn.BatchNorm1d(500),
            nn.Dropout(p=0.5),
            nn.Linear(in_features=500, out_features=num_classes))
        '''

    def freeze_nonlinear_layers(self):
        self._freeze_layer(self.net.conv1)
        self._freeze_layer(self.net.bn1)
        self._freeze_layer(self.net.relu)
        self._freeze_layer(self.net.maxpool)
        self._freeze_layer(self.net.layer1)
        self._freeze_layer(self.net.layer2)
        self._freeze_layer(self.net.layer3)
        self._freeze_layer(self.net.layer4)
        self._freeze_layer(self.net.avgpool)

    def _freeze_layer(self, layer, freeze=True):
        # Toggle requires_grad for all parameters of the given layer
        for p in layer.parameters():
            p.requires_grad = not freeze

    def forward(self, inputs):
        # Attempt 2: build whole network with dropout layers
        '''
        out = self.net.conv1(inputs)
        out = self.net.bn1(out)
        out = self.net.relu(out)
        out = self.net.maxpool(out)
        out = self.dropout(out)
        out = self.net.layer1(out)
        out = self.dropout(out)
        out = self.net.layer2(out)
        out = self.dropout(out)
        out = self.net.layer3(out)
        out = self.dropout(out)
        out = self.net.layer4(out)
        out = self.dropout(out)
        out = self.net.avgpool(out)
        out = out.view(out.size(0), -1)
        out = self.net.fc(out)
        return out
        '''
        # Attempt 3: freeze nonlinear layers and only train the last linear
        # layer (called once before training rather than here in forward)
        # self.freeze_nonlinear_layers()
        return self.net(inputs)  # attempts 1 and 3; attempt 2 returns out above
```
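For attempt 3, the freezing and the optimizer setup would look roughly like this (a sketch; the learning rate is illustrative, not the value I actually used):

```python
import torch

model = ResNet(num_classes=2)

# Freeze the backbone once, before training starts
model.freeze_nonlinear_layers()

# Pass only the still-trainable parameters (the new fc head) to the optimizer
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)
```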
To sum things up: either the network does not converge at all, or the validation loss rises and the test accuracy stays poor.
Help is much appreciated. Thanks in advance!