# Cross Entropy Loss not decreasing? 3-category image classification

Hi,

I wonder why the training loss does not decrease?

The input image batch has the shape [batch_size, 1, 200, 200]. The true labels are one-hot encoded: [1,0,0], [0,1,0], and [0,0,1].
I have only 5000 images, so I decided to start with transfer learning.

I loaded pretrained ResNet18 to classify my images into 3 categories.
I changed the input channels of ResNet18 to 1 and the final output dimension to 3 to suit my classification problem.

So the shape of the model output tensor is [batch_size, 3], and the shape of the true label tensor is also [batch_size, 3].

Since I could not find a built-in PyTorch cross-entropy loss that takes two tensors of this shape, I wrote my own cross-entropy loss function based on the equation:

```
loss = t.mean(-t.sum(target.float() * t.log(y_prediction), dim=1))
```
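As a sanity check, this hand-written formula can be compared against PyTorch's built-in `F.cross_entropy`, which takes raw logits and integer class indices rather than one-hot targets. A minimal sketch (the batch size, labels, and logits below are made up):

```python
import torch as t
import torch.nn.functional as F

t.manual_seed(0)
labels = t.tensor([0, 2, 1, 0])              # hypothetical class indices
target = F.one_hot(labels, num_classes=3)    # one-hot targets, shape [4, 3]
logits = t.randn(4, 3)                       # stand-in for raw model output

# Manual cross entropy over softmax probabilities (the formula above)
probs = F.softmax(logits, dim=1)
manual = t.mean(-t.sum(target.float() * t.log(probs), dim=1))

# Built-in equivalent: log-softmax and NLL are fused internally
builtin = F.cross_entropy(logits, labels)
print(t.allclose(manual, builtin))           # True
```

So the formula itself is fine; the built-in version is just more numerically stable because it never materializes `log(softmax(...))` in two steps.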

Also, I am confused about the output of ResNet18: I read somewhere that ResNet18 has a softmax layer before the output, but the elements of the final output did not add up to 1. So I added a softmax layer after ResNet18.

My training code:

```
import torch as t
import torch.nn.functional as F
import torch.optim as optim

device = t.device('cuda:1' if t.cuda.is_available() else 'cpu')
optimizer = optim.SGD(my_resnet18_model.parameters(), lr=0.01, momentum=0.9)

images = images.float().to(device)
target = target.float().to(device)

optimizer.zero_grad()  # clear gradients accumulated from the previous step
y_prediction = my_resnet18_model(images)
y_prediction = F.softmax(y_prediction, dim=1)
loss = t.mean(-t.sum(target * t.log(y_prediction), dim=1))
loss.backward()
optimizer.step()

running_loss += loss.item() * images.shape[0]  # weight by batch size
epoch_loss = running_loss / len(train_dataset)
print('train loss: {}'.format(epoch_loss))
```

The epoch loss stayed around 1.09.

Am I using the loss function right?
Could I add the softmax after ResNet18?
How can I improve?

You can just change the target to a class index instead of one-hot: [1,0,0] becomes label 0, [0,1,0] becomes label 1, and [0,0,1] becomes label 2.

After this you can just use the built-in cross-entropy loss in PyTorch.
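Concretely, a sketch of that conversion with a made-up batch (`argmax` recovers the index from each one-hot row, and note that `nn.CrossEntropyLoss` expects raw logits, not softmaxed outputs):

```python
import torch as t
import torch.nn as nn

# hypothetical one-hot targets for a batch of 4 images
one_hot = t.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]])
labels = one_hot.argmax(dim=1)     # -> tensor([0, 1, 2, 0])

criterion = nn.CrossEntropyLoss()  # applies log-softmax internally
logits = t.randn(4, 3)             # stand-in for my_resnet18_model(images)
loss = criterion(logits, labels)   # scalar, ready for loss.backward()
```

Because the softmax lives inside the loss, you would also drop the extra `F.softmax` call from the forward pass.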


Okay, I decided to do a two-category classification first with labels “0” and “1”, and see how it goes.
Thanks!

Well, why not use the binary cross-entropy loss?

```
criterion = nn.BCEWithLogitsLoss()
...
loss = criterion(y_prediction, target)
```
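For the two-category case, `BCEWithLogitsLoss` takes raw logits and float targets of the same shape; a minimal sketch, assuming a model head with a single output unit (the shapes and labels are made up):

```python
import torch as t
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()           # sigmoid + BCE fused, numerically stable
logits = t.randn(4, 1)                       # stand-in for a 1-output model head
target = t.tensor([[0.], [1.], [1.], [0.]])  # float labels, same shape as logits
loss = criterion(logits, target)
```

Since the sigmoid is applied inside the loss, there is no need for a sigmoid or softmax layer after the model here either.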

Not sure if I am missing something here?

I will take a look at BCE. If it accepts tensors of the right shape, it is worth a try. Thanks!

It does, as I have been using it for quite some time.
If it does not work for you, maybe your tensors are not structured correctly.

```
BCEWithLogitsLoss()
```

works for my case.
Thanks again!