Mysterious behaviour of cross entropy loss

Hello,
I have encountered a very weird behaviour of nn.functional.cross_entropy(pred, label, reduction=‘none’) function. Following is the behaviour I observed. It would be great if someone could explain the reason for this:

Expected behaviour - If label[i] >= C for any i, where pred is of shape N x C x H x W, cross_entropy should raise an IndexError.

Showcased behaviour -

• If pred and target are CPU tensors. Output: IndexError: Target 20 is out of bounds.
• If pred and target are GPU (CUDA) tensors, no error is raised.

Why does cross_entropy behave differently for GPU and CPU tensors? The behaviour observed for CUDA tensors is unexpected; isn't this a bug?

My environment:
Python: 3.7.7
PyTorch: 1.6.0
CUDA: 10.1

PS: The above behaviour consumed almost 3 hours of my time. I hope someone can help me understand the reason for it so that I will be more cautious the next time I deal with GPU tensors.

Below is the code snippet to replicate:

```python
import torch
import torch.nn.functional as F

input = torch.randn(1, 19, 32, 64, requires_grad=True)  # N x C x H x W, C = 19 classes
input_c = input.cuda()
target = torch.randint(22, size=(1, 32, 64))  # labels in [0, 21], so some are >= C
target_c = target.cuda()

F.cross_entropy(input_c, target_c, weight=None, reduction='none', ignore_index=255)
```

This code does not raise any error, whereas removing the .cuda() calls makes it raise an IndexError.
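Until the underlying bug is fixed, one workaround is to validate the labels before handing the tensors to the GPU loss. The helper below is a minimal sketch of that idea; `checked_cross_entropy` is a hypothetical name I made up for illustration, not a PyTorch API:

```python
import torch
import torch.nn.functional as F

def checked_cross_entropy(pred, target, ignore_index=255, **kwargs):
    # Hypothetical wrapper: check class indices up front, since the GPU
    # path with reduction='none' silently accepts out-of-range targets.
    n_classes = pred.size(1)
    valid = target[target != ignore_index]  # ignored positions may hold any value
    if valid.numel() > 0 and (valid.min() < 0 or valid.max() >= n_classes):
        raise IndexError(
            f"Target {int(valid.max())} is out of bounds for {n_classes} classes."
        )
    return F.cross_entropy(pred, target, ignore_index=ignore_index, **kwargs)

pred = torch.randn(1, 19, 32, 64)                  # N x C x H x W, C = 19
bad_target = torch.randint(22, size=(1, 32, 64))   # labels up to 21 >= C
```

The check runs on whichever device the tensors live on, so it also works after `.cuda()`; the cost of one extra min/max pass over the target is usually negligible next to the loss itself.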

I could reproduce the issue. I think you have stumbled upon a PyTorch bug in `nll_loss2d`.
I played around with your example and noticed that `reduction='none'` could be the reason: the error is triggered for the other `reduction` options (`mean` / `sum`).
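For contrast, on the CPU all three reduction modes report the bad target. The snippet below is a quick check of that expectation, not something taken from the PyTorch test suite:

```python
import torch
import torch.nn.functional as F

pred = torch.randn(1, 19, 32, 64)                      # 19 classes
target = torch.full((1, 32, 64), 20, dtype=torch.long)  # 20 >= 19, out of range

# On CPU every reduction mode raises; on the buggy GPU path only
# 'mean' and 'sum' surfaced the error.
for reduction in ('none', 'mean', 'sum'):
    try:
        F.cross_entropy(pred, target, reduction=reduction)
    except IndexError as e:
        print(f"reduction={reduction!r} raised: {e}")
```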

Further, the reason could be that the device-side assertion is not checked in the case of `reduction='none'`: the `THCudaCheck(cudaGetLastError());` call is missing on that path.

I have raised a bug on your behalf: nll_loss2d: t >= 0 && t < n_classes assertion is not checked when using GPU tensors and reduction='none' · Issue #49882 · pytorch/pytorch · GitHub


Thanks for raising the bug.