The accuracy of the convolutional neural network stays the same when the criterion is selected as CrossEntropyLoss

talhak · August 8, 2019, 9:09am

As the title says, the accuracy of my CNN stays the same when I use CrossEntropyLoss as the criterion. I chose CrossEntropyLoss specifically because it is the only loss that gives a test loss close to the training loss. The other loss functions do not show this problem.

Here is the overview of the constructed CNN model:

MyNet(
  (activation_fn): ReLU(inplace)
  (conv1): Sequential(
    (0): Conv2d(3, 16, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU(inplace)
    (2): Dropout2d(p=0.5)
  )
  (conv2): Sequential(
    (0): Conv2d(16, 32, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU(inplace)
    (2): Dropout2d(p=0.5)
    (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (conv3): Sequential(
    (0): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU(inplace)
    (2): Dropout2d(p=0.5)
  )
  (conv4): Sequential(
    (0): Conv2d(64, 128, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU(inplace)
    (2): Dropout2d(p=0.5)
  )
  (conv5): Sequential(
    (0): Conv2d(128, 256, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU(inplace)
    (2): Dropout2d(p=0.5)
    (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (fc1): Sequential(
    (0): Linear(in_features=4096, out_features=1600, bias=True)
    (1): Dropout2d(p=0.5)
  )
  (fc2): Sequential(
    (0): Linear(in_features=1600, out_features=400, bias=True)
    (1): Dropout2d(p=0.5)
  )
  (fc3): Sequential(
    (0): Linear(in_features=400, out_features=100, bias=True)
    (1): Dropout2d(p=0.5)
  )
  (fc4): Sequential(
    (0): Linear(in_features=100, out_features=8, bias=True)
  )
)

Here is my test function:

import math
import torch

def test():
    # Relies on the globals: model, criterion, device, test_loader,
    # test_batch_size and test_losses.
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)

            # The output is already on the model's device.
            output = model(data)
            test_loss += criterion(output, target).item()
            # Predicted class = index of the largest output per sample.
            _, predicted = torch.max(output, 1)
            correct += (predicted == target).sum().item()

    # Average the per-batch losses over the number of batches.
    test_loss /= math.ceil(len(test_loader.dataset) / test_batch_size)
    test_losses.append(test_loss)

    acc = 100. * correct / len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.2f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset), acc))

ptrblck · August 9, 2019, 12:28am

Which other loss functions did you use?
Was the accuracy better or worse?
Is your current loss changing, while the accuracy stays the same from the beginning?

talhak · August 9, 2019, 8:26pm

Thanks @ptrblck for your help. As stated in the OP, I used CrossEntropyLoss. The accuracy stays at 50.22%, which is worse than with NLLLoss, but the test loss is a lot better. As for the current loss: yes, it is changing.

ptrblck · August 9, 2019, 10:32pm

Are you applying F.log_softmax inside your forward method? nn.CrossEntropyLoss expects logits, as the log softmax will be applied internally. If you pass raw logits directly to nn.NLLLoss, I would assume your model performs worse.
Have you played around with some hyperparameters, e.g. the learning rate?

Is your dataset balanced, i.e. is the class distribution approx. equal?

talhak · August 10, 2019, 1:04pm

Yes, I am applying F.log_softmax inside the forward function. When I change learning rate from 0.001 to 0.01, now the accuracy changes a little bit but still it is way lower than nn.NLLLoss. So when using nn.CrossEntropyLoss, should I directly return the output inside the forward function?

Regarding my dataset, it is a publicly available facial expression dataset hosted on GitHub. It is not well balanced, due to the nature of the data.

ptrblck · August 10, 2019, 1:48pm

Thanks for the information!
Since you are applying F.log_softmax, you should use nn.NLLLoss as the criterion.

talhak · August 10, 2019, 1:52pm

Not at all, thanks for the feedback. Fair enough, but I am actually testing various loss functions to see whether I can improve the accuracy. So, when using nn.CrossEntropyLoss, should I directly return the output of the FC layers as the last statement of the forward function?

ptrblck · August 10, 2019, 1:55pm

nn.CrossEntropyLoss expects logits, so you should return the output of the last layer without any non-linearity.
However, internally nn.CrossEntropyLoss will apply F.log_softmax and nn.NLLLoss as seen here, so you would get the same results.
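
As a minimal sketch of that equivalence (the batch size, the number of classes and the targets below are made up for illustration and are not taken from the model in this thread), the two correct combinations should give the same value, while raw logits passed to nn.NLLLoss give something different:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 8)            # hypothetical batch: 4 samples, 8 classes
target = torch.tensor([0, 3, 7, 2])   # hypothetical class indices

# Raw logits + CrossEntropyLoss (log_softmax is applied internally).
ce_on_logits = nn.CrossEntropyLoss()(logits, target)

# Explicit log_softmax + NLLLoss.
nll_on_log_probs = nn.NLLLoss()(F.log_softmax(logits, dim=1), target)

# Incorrect usage, shown only for comparison: raw logits + NLLLoss.
nll_on_logits = nn.NLLLoss()(logits, target)

print(ce_on_logits.item(), nll_on_log_probs.item(), nll_on_logits.item())
```

The first two printed values should match up to floating point error; the third is not a valid negative log-likelihood, since the inputs are not log probabilities.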

talhak · August 10, 2019, 2:28pm

Last question: what could be the reason for a high loss despite a good accuracy when using nn.NLLLoss? For the same dataset, the average loss is 46.84 even though the accuracy is 84.19%. The result does not improve over the epochs and even gets worse. I have played around with the hyperparameters and tried different optimizers following recommendations available on the Internet, but nothing changed significantly.

ptrblck · August 10, 2019, 2:45pm

Your model might be overfitting to the majority class and thus not learning anything useful.
Could you check your class distribution and see if the majority class occurs in approx. 84% of all cases?
Also check the prediction for unique values.

If you see that the class imbalance might be the reason for this issue, you could use a WeightedRandomSampler or pass class weights to your loss function to counter this effect.
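
Here is a rough sketch of both options on made-up toy data (the 90/10 split, the tensor shapes and the inverse-frequency weighting are assumptions for illustration, not your dataset or a recommended setting). Usually you would pick one of the two rather than combine them:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Hypothetical imbalanced toy data: 90 samples of class 0, 10 of class 1.
targets = torch.cat([torch.zeros(90, dtype=torch.long),
                     torch.ones(10, dtype=torch.long)])
data = torch.randn(len(targets), 3)
dataset = TensorDataset(data, targets)

# Inverse class frequency as a simple (assumed) weighting scheme.
class_counts = torch.bincount(targets)
class_weights = 1.0 / class_counts.float()

# Option 1: oversample rare classes by giving each sample its class weight.
sample_weights = class_weights[targets]
sampler = WeightedRandomSampler(sample_weights,
                                num_samples=len(sample_weights),
                                replacement=True)
loader = DataLoader(dataset, batch_size=20, sampler=sampler)

# Option 2: keep the default sampling and weight the classes in the loss.
criterion = nn.CrossEntropyLoss(weight=class_weights)
```

With the sampler, each batch should contain roughly balanced classes on average, at the cost of repeating minority samples within an epoch.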

talhak · August 10, 2019, 3:20pm

I see. I just integrated the method proposed in another topic here to use WeightedRandomSampler, and I will post the result here when the training and test phases are completed. By the way, could you recommend any utility function to check the class distribution and the predictions for unique values?

ptrblck · August 10, 2019, 4:03pm

I would store the targets in a tensor and call .unique(return_counts=True) on it. The same would work for the predictions (use torch.argmax(output, 1) to get the class predictions).
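
A small sketch of that check, using made-up tensors in place of your real targets and model outputs (in a real test loop you would collect the per-batch tensors, e.g. with torch.cat, first):

```python
import torch

# Hypothetical stand-ins for the collected targets and model outputs.
all_targets = torch.tensor([0, 0, 0, 1, 2, 0, 1, 0])
output = torch.tensor([[2.0, 0.1, 0.3]] * 8)  # a model that always favors class 0

# Class distribution of the targets.
classes, counts = all_targets.unique(return_counts=True)
print(classes.tolist(), counts.tolist())

# Class predictions and how often each class is predicted.
preds = torch.argmax(output, 1)
pred_classes, pred_counts = preds.unique(return_counts=True)
print(pred_classes.tolist(), pred_counts.tolist())
```

If the predictions contain only a single unique value, the model is most likely just predicting the majority class.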

talhak · August 11, 2019, 5:10am

When I use WeightedRandomSampler, or the sampler published on GitHub named imbalanced-dataset-sampler, the average loss decreases a lot (it is now ~2), but the accuracy also drops dramatically: it was ~50% and is now ~2.50%. So, is the only solution to simply skip this dataset, since it is imbalanced and samplers do not work well?

ptrblck · August 11, 2019, 11:06am

Balancing the dataset should trade the accuracy of the majority class for an increase in the accuracy of the minority classes.
I’ve created a tutorial a while ago here, which is quite old by now, but might give you some insight.
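
To see this trade-off directly, it might help to look at the accuracy per class instead of only the overall number. A minimal sketch, where per_class_accuracy is a hypothetical helper (not a PyTorch function) and the tensors are made up:

```python
import torch

def per_class_accuracy(preds, targets, num_classes):
    """Fraction of correctly classified samples for each class."""
    # Count the correct predictions per class.
    correct = torch.bincount(targets[preds == targets], minlength=num_classes)
    # Count all samples per class; clamp avoids division by zero.
    total = torch.bincount(targets, minlength=num_classes)
    return correct.float() / total.clamp(min=1).float()

# Hypothetical targets and predictions for 3 classes.
targets = torch.tensor([0, 0, 0, 0, 1, 1, 2, 2])
preds = torch.tensor([0, 0, 0, 0, 0, 1, 0, 2])

acc = per_class_accuracy(preds, targets, num_classes=3)
print(acc.tolist())  # [1.0, 0.5, 0.5]
```

In this toy case the overall accuracy is 75%, but the two minority classes are only at 50% each, which the single overall number would hide.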