Neural Network not training, stuck at 50% accuracy

Training loss remains roughly flat. On the test set, the model produces an accuracy of 50%, which is no better than guessing since there are only 2 classes. I have already tried increasing/decreasing model complexity, adjusting hyperparameters, and data augmentation, basically anything to get the model to underfit/overfit the data. I can’t tell if there is something wrong with the neural network or with the dataset itself.

import torch
import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=5, padding=0)
        self.bn1 = nn.BatchNorm2d(32)  # note: bn1/bn2/bn3 are defined but never applied in forward()
        self.conv2 = nn.Conv2d(32, 64, kernel_size=5, padding=0)
        self.bn2 = nn.BatchNorm2d(64)  
        self.conv3 = nn.Conv2d(64, 128, kernel_size=5, padding=0)
        self.bn3 = nn.BatchNorm2d(128)   
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(128 * 12 * 8, 512)
        self.fc2 = nn.Linear(512, 256)
        self.fc3 = nn.Linear(256, 2) 

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.pool(torch.relu(self.conv2(x)))
        x = self.pool(torch.relu(self.conv3(x)))
        x = torch.flatten(x, 1)
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x

num_epochs = 10
for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    correct_train = 0
    total_train = 0
    for images, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    
    _, predicted = torch.max(outputs.data, 1)  # note: this and the next two lines sit outside the batch loop, so only the last batch is counted
    total_train += labels.size(0)
    correct_train += (predicted == labels).sum().item()

    train_accuracy = 100 * correct_train / total_train
    print(f"Epoch {epoch+1}, Training Loss: {running_loss/len(train_loader)}, Training Accuracy: {train_accuracy}%")
Epoch 1, Training Loss: 0.6989821430408594, Training Accuracy: 52.94117647058823%
Epoch 2, Training Loss: 0.6789375381036238, Training Accuracy: 58.8235294117647%
Epoch 3, Training Loss: 0.6709084140531945, Training Accuracy: 88.23529411764706%
Epoch 4, Training Loss: 0.6927016901247429, Training Accuracy: 52.94117647058823%
Epoch 5, Training Loss: 0.6819337732864149, Training Accuracy: 64.70588235294117%
Epoch 6, Training Loss: 0.6968633731206259, Training Accuracy: 47.05882352941177%
Epoch 7, Training Loss: 0.6873575990850275, Training Accuracy: 52.94117647058823%
Epoch 8, Training Loss: 0.6847923795382181, Training Accuracy: 58.8235294117647%
Epoch 9, Training Loss: 0.683509703838464, Training Accuracy: 64.70588235294117%
Epoch 10, Training Loss: 0.6756617174004064, Training Accuracy: 52.94117647058823%

Accuracy on test set: 50.0%

Where are you defining the loss function and optimizer? Please copy that part of your code.

I define them just before the training loop.

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

Print your labels to see what you’re working with.

You also might be better off using BCEWithLogitsLoss. That’s specifically designed for binary (two-class) classification. In that case, you’ll need to frame your labels so that class 0 corresponds to a value of 0 and class 1 corresponds to a value of 1.
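Something along these lines is what I mean. This is only a sketch: it assumes you change the final layer to a single output logit and that your loader yields integer 0/1 class labels.

import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

for images, labels in train_loader:
    outputs = model(images)                # shape [batch, 1] if the last layer outputs a single logit
    targets = labels.float().unsqueeze(1)  # integer 0/1 labels -> float targets of shape [batch, 1]
    loss = criterion(outputs, targets)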


Here are my labels:

target_to_class = {v: k for k, v in ImageFolder(data_dir).class_to_idx.items()}
print(target_to_class)

{0: 'HCM_None', 1: 'HCM_Present'}

I tried using BCEWithLogitsLoss, but it raised this error:

ValueError: Target size (torch.Size([32])) must be the same as input size (torch.Size([32, 2]))

I added an unsqueeze to try and get rid of the ValueError:

# Training the model
num_epochs = 10
for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0
    for images, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(images)
        labels = labels.unsqueeze(1).float()  # unsqueeze labels to shape [batch, 1] and convert to float
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    
        predictions = torch.round(outputs)  
        correct += (predictions == labels).sum().item()
        total += labels.size(0)

    train_accuracy = correct / total
    print(f"Epoch {epoch+1}, Training Loss: {running_loss/len(train_loader)}, Training Accuracy: {train_accuracy}%")

The training loss and accuracy still remain the same though.

Epoch 1, Training Loss: 0.670424714232936, Training Accuracy: 0.5110470701248799%
Epoch 2, Training Loss: 0.6699145544659008, Training Accuracy: 0.457252641690682%
Epoch 3, Training Loss: 0.6695302724838257, Training Accuracy: 0.4217098943323727%
Epoch 4, Training Loss: 0.6671686551787637, Training Accuracy: 0.4505283381364073%
Epoch 5, Training Loss: 0.6690011150909193, Training Accuracy: 0.46301633045148893%
Epoch 6, Training Loss: 0.6681144020774148, Training Accuracy: 0.4783861671469741%
Epoch 7, Training Loss: 0.6691685246698784, Training Accuracy: 0.4303554274735831%
Epoch 8, Training Loss: 0.664765780622309, Training Accuracy: 0.5341018251681076%
Epoch 9, Training Loss: 0.6677677992618445, Training Accuracy: 0.42363112391930835%
Epoch 10, Training Loss: 0.6644004493048696, Training Accuracy: 0.4111431316042267%

Maybe post the dataset you’re using as well? It could be that your model is too simple for the complexity of the data it needs to learn.

To frame it for BCE, you need the model output size to be 1.

However, it seems there may be some other issue, since it should still work with CrossEntropyLoss and two classes.
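Roughly, the change I mean would look like this (just a sketch based on your posted CNN class and training loop):

# in __init__: a single output logit instead of two
self.fc3 = nn.Linear(256, 1)

# in the training loop: threshold the sigmoid of the logit
# instead of rounding the raw outputs
probs = torch.sigmoid(outputs)                    # outputs: logits of shape [batch, 1]
predictions = (probs > 0.5).float()
correct += (predictions == labels).sum().item()   # labels: floats of shape [batch, 1]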

How much data do you have? What’s the percentage split between training & validation sets?

Have you tried training for more than 10 epochs?


What is your input size? What is the real size of your training samples? Does the 12*8 going into fc1 actually match it? Are you doing data augmentation?
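One quick way to check is to push a dummy tensor with your real input size through the conv/pool stack and print the shape before the flatten. Sketch below; the 1x128x96 size is just a placeholder (it happens to come out to 128x12x8 with your layer settings), so swap in your actual image dimensions.

import torch

dummy = torch.zeros(1, 1, 128, 96)  # (batch, channels, H, W): placeholder size, use your real one
with torch.no_grad():
    x = model.pool(torch.relu(model.conv1(dummy)))
    x = model.pool(torch.relu(model.conv2(x)))
    x = model.pool(torch.relu(model.conv3(x)))
print(x.shape)  # needs to be [1, 128, 12, 8] for fc1 = nn.Linear(128 * 12 * 8, 512) to work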