Loss Not Changing - No Convergence

Hi all,

I’ve been working on a basic model for some time now. It’s a multi-class classification problem with images (10 classes). Details are below:
Target = one-hot encoding || 10 possible outcomes > digits 0-9
Images are 28x28. They have been normalised by dividing by 255 to give float32 values between 0 and 1.
Training data has shape [60000, 28, 28] – grayscale images.
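The preprocessing described above can be sketched like this (a hypothetical reconstruction, not my actual code; the random tensors and variable names are stand-ins for the real dataset):

```python
import torch

# Stand-in for raw MNIST-style data: uint8 images and integer labels
# (assumption, for illustration only).
images = torch.randint(0, 256, (60000, 28, 28), dtype=torch.uint8)
labels = torch.randint(0, 10, (60000,))

# Normalise to float32 in [0, 1] by dividing by 255.
images = images.float() / 255.0

# One-hot encode the 10 classes (digits 0-9).
targets = torch.nn.functional.one_hot(labels, num_classes=10).float()

print(images.dtype)    # torch.float32
print(targets.shape)   # torch.Size([60000, 10])
```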
Model code is below

import torch
from torch import nn, optim
import torch.nn.functional as torch_func

## Set the device to GPU if available
device = 'cuda' if torch.cuda.is_available() else 'cpu'

model = nn.Sequential(
            nn.Linear(784, 1024),
            nn.Linear(1024, 512),
            nn.Linear(512, 10),
)

model = model.to(device) ## move it to GPU if available

The training code is below:

learning_rate = 1e-3
optimizer = optim.SGD(model.parameters(), lr=learning_rate)
loss_fn = nn.MSELoss()
n_epochs = 10

for epoch in range(n_epochs):
    for label, img in train_data:
      outputs = model(img.view(-1).unsqueeze(0))
      loss = loss_fn(outputs.squeeze(0), label)
      loss.requires_grad = True
    print("Epoch: %d, Loss: %f" % (epoch, float(loss)))

Output is:

Epoch: 0, Loss: 0.091024
Epoch: 1, Loss: 0.091024
Epoch: 2, Loss: 0.091024
Epoch: 3, Loss: 0.091024
Epoch: 4, Loss: 0.091024
Epoch: 5, Loss: 0.091024
Epoch: 6, Loss: 0.091024
Epoch: 7, Loss: 0.091024
Epoch: 8, Loss: 0.091024
Epoch: 9, Loss: 0.091024

It could be a data prep problem but I am open to any and all suggestions.

Many thanks in advance.


For a multi-class classification use case I would recommend using nn.CrossEntropyLoss instead of nn.MSELoss. You would also remove any nn.Softmax at the end of the model, since nn.CrossEntropyLoss expects raw logits.
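Also note that the posted loop never calls loss.backward() or optimizer.step(), so the weights are never updated, which is why the loss stays frozen (setting loss.requires_grad = True does not help). A minimal corrected sketch, not your exact setup: random stand-in data replaces the real dataset, and ReLU activations are added, since stacked Linear layers with no non-linearity collapse to a single linear map:

```python
import torch
from torch import nn, optim

# Assumed architecture: same layer sizes as the original post,
# with ReLU non-linearities added between the Linear layers.
model = nn.Sequential(
    nn.Linear(784, 1024),
    nn.ReLU(),
    nn.Linear(1024, 512),
    nn.ReLU(),
    nn.Linear(512, 10),   # raw logits, no softmax
)

optimizer = optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()  # expects logits and integer class indices

# Stand-in batch: 64 random "images" and labels (assumption, for illustration).
torch.manual_seed(0)
imgs = torch.rand(64, 28, 28)
labels = torch.randint(0, 10, (64,))

losses = []
for epoch in range(5):
    outputs = model(imgs.view(imgs.size(0), -1))  # flatten to [batch, 784]
    loss = loss_fn(outputs, labels)
    optimizer.zero_grad()   # clear gradients from the previous step
    loss.backward()         # compute gradients
    optimizer.step()        # update the weights
    losses.append(loss.item())
    print(f"Epoch: {epoch}, Loss: {loss.item():.6f}")
```

With these calls in place the printed loss changes between epochs instead of staying constant.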