Loss staying the same throughout training

I’m using CNNs for age and gender detection. While training my model, the loss seems to stay the same and I’m not sure why. I’d appreciate any help!

import torch
import torch.nn as nn
import torch.nn.functional as F

class neural_network(nn.Module):
    def __init__(self):
        super().__init__()
            
        self.conv1 = nn.Conv2d(1, 32, 5)
        self.conv2 = nn.Conv2d(32, 64, 5)
        self.conv3 = nn.Conv2d(64, 128, 5)
#         self.conv4 = nn.Conv2d(128, 256, 3)
        self.flatten = nn.Flatten()
        self.output = nn.Linear(512, 2)  # 128 channels * 2 * 2 spatial after three conv+pool blocks on a 48x48 input

    def forward(self, x):
        x = x.view(-1,1,48,48)
        x = F.max_pool2d(F.relu(self.conv1(x)), (2,2))
        x = F.max_pool2d(F.relu(self.conv2(x)), (2,2))
        x = F.max_pool2d(F.relu(self.conv3(x)), (2,2))
#         x = F.max_pool2d(F.relu(self.conv4(x)), (2,2))
        
        x = self.flatten(x)
        x = self.output(x)
        return F.softmax(x, dim=1)

neural_net = neural_network()

import torch.optim as optim
from tqdm import tqdm

optimizer = optim.Adam(neural_net.parameters(), lr=1e-4, weight_decay = 5e-5)
loss_function = nn.NLLLoss()

EPOCHS = 3
BATCH_SIZE = 10
y = gender_labels
y = y.type(torch.LongTensor)

correct = 0
predictions = []
correct_labels = []
total = 0

for epoch in range(EPOCHS):
    for i in tqdm(range(0, len(X), BATCH_SIZE)):
        
        optimizer.zero_grad()
        train_X = X[i:i+BATCH_SIZE]
        train_y = y[i:i+BATCH_SIZE]
        output = neural_net(train_X)
        loss = loss_function(output, train_y)
        loss.backward()
        optimizer.step()
        
        with torch.no_grad():
            for idx, out in enumerate(output):  # renamed from `i` so it doesn't shadow the batch index
                if torch.argmax(out) == train_y[idx]:
                    correct += 1
                predictions.append(torch.argmax(out).tolist())
                correct_labels.append(train_y[idx].tolist())
                total += 1
    print(loss)

print("Accuracy: ", round(correct/total, 3))

And this is my output:

100%|██████████| 2368/2368 [02:53<00:00, 13.67it/s]
  0%|          | 2/2368 [00:00<02:36, 15.08it/s]
tensor(-1.0000, grad_fn=<NllLossBackward>)
100%|██████████| 2368/2368 [03:08<00:00, 12.54it/s]
  0%|          | 2/2368 [00:00<02:52, 13.74it/s]
tensor(-1.0000, grad_fn=<NllLossBackward>)
100%|██████████| 2368/2368 [03:38<00:00, 10.83it/s]
tensor(-1.0000, grad_fn=<NllLossBackward>)
Accuracy:  0.457

Once again, thanks for the help!

You first have to instantiate the model (as in object-oriented programming), and then use it. So use the following instead:

# before the epoch-loop
model = neural_network()
...
# in the loop
output = model(train_X)

Oh, I forgot to show where I initialize it. The variable name is neural_net, which is different from the neural_network class. I’ve updated the code, my bad on that.

You may need to put your model in training mode in the training loop. (Training mode is the default for a freshly constructed module, but it’s good practice to set it explicitly.)

neural_net.train()

For evaluation/testing it should be set to evaluation mode.

neural_net.eval()
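
Roughly like this (just a sketch, assuming the same loop structure as in your post):

neural_net.train()   # training-mode behaviour (dropout, batchnorm, etc.)
for epoch in range(EPOCHS):
    for i in tqdm(range(0, len(X), BATCH_SIZE)):
        ...  # forward pass, loss, backward, optimizer step as above

neural_net.eval()    # switch to evaluation mode before validating/testing
with torch.no_grad():
    ...  # run the model on held-out data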

Made no difference, the loss is still staying the same.

As the NLLLoss documentation says, it expects log-probabilities as input, i.e. the output of LogSoftmax.
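
NLLLoss just picks out -input[target], so if you feed it softmax probabilities the loss is bounded in [-1, 0] and saturates at -1.0, exactly like in your output; log-probabilities give the actual negative log-likelihood. A quick demonstration with made-up logits, just to show the difference:

import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, -1.0]])
target = torch.tensor([0])

# softmax probabilities into NLLLoss: bounded in [-1, 0], saturates at -1
print(F.nll_loss(F.softmax(logits, dim=1), target))      # tensor(-0.9526)

# log-probabilities into NLLLoss: the proper negative log-likelihood
print(F.nll_loss(F.log_softmax(logits, dim=1), target))  # tensor(0.0486)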

So would softmax work with Adam then?

What Alex means is that in your model you should replace softmax with log_softmax.
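
Concretely, only the last line of your forward needs to change (alternatively, return the raw logits and use nn.CrossEntropyLoss, which combines LogSoftmax and NLLLoss for you). A minimal sketch based on your model:

    def forward(self, x):
        x = x.view(-1, 1, 48, 48)
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv2(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv3(x)), (2, 2))
        x = self.flatten(x)
        x = self.output(x)
        return F.log_softmax(x, dim=1)  # log-probabilities, which is what NLLLoss expects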

Oh, I forgot to mark it as the solution. Yeah, it works, but the loss is around 0.8. I’m thinking I might need to change the loss function so that the loss decreases further. How do you think I can decrease the loss and get it closer to 0, since it’s still quite high?