Target out of bound error not raised on GPU

The following code snippet is from the documentation, except that I changed the target to be out of bounds.

import torch
import torch.nn as nn

device = 'cuda:0'  # switch to 'cpu' to compare
loss = nn.CrossEntropyLoss().to(device)
input = torch.randn(3, 2, requires_grad=True).to(device)
target = torch.empty(3, dtype=torch.long).random_(15).to(device)  # only classes 0 and 1 are valid
output = loss(input, target)

On device = 'cpu' it raises the correct error:

# example
IndexError: Target 2 is out of bounds.

However, the same code snippet (with the target out of bounds) does not raise this error on device = 'cuda:0'; it returns a loss of 0 instead.

I couldn’t figure out why the behavior is like this, or whether I missed something.

PyTorch version = 1.5.0
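As a workaround in the meantime, a manual bounds check on the CPU side catches bad targets before the CUDA kernel sees them. This is a sketch of my own; `check_targets` is my helper, not a PyTorch function, and it just mirrors the `IndexError` the CPU path raises:

```python
import torch
import torch.nn as nn

def check_targets(target, num_classes):
    # Manual bounds check, mimicking the IndexError raised by the CPU path.
    if target.min() < 0 or target.max() >= num_classes:
        raise IndexError(f"Target {target.max().item()} is out of bounds.")

input = torch.randn(3, 2, requires_grad=True)
bad_target = torch.tensor([0, 1, 13])  # 13 is out of bounds for 2 classes

try:
    check_targets(bad_target, num_classes=2)
    loss = nn.CrossEntropyLoss()(input, bad_target)
except IndexError as e:
    print(e)  # Target 13 is out of bounds.
```

Running the check before moving tensors to the GPU costs almost nothing and fails loudly even on the affected builds.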

Triggers the expected assertion error for me with torch 1.3.

Technically, you are changing the target to be random, so you have a 0.2% chance of generating a valid target…

I did mention in the question that I checked that the target was out of bounds.

It triggers this error on my system:

import torch
import torch.nn as nn

device = 'cuda'
loss = nn.CrossEntropyLoss().to(device)
input = torch.randn(3, 2, requires_grad=True)
target = torch.empty(3, dtype=torch.long).random_(15)
output = loss(input, target)
> IndexError: Target 13 is out of bounds.

@ptrblck actually I forgot to mention that I was using the PyTorch Docker image when I got the behavior described in the question.

I just tested on a host machine, and it does raise the error there.
The issue seems specific to the Docker image.

Hi,

I got the same unexpected behaviour with PyTorch 1.5.0 on a GPU:

import torch

loss = torch.nn.CrossEntropyLoss()
weights = torch.randn(10, 5)
labels = torch.arange(10)  # labels 5-9 are out of bounds for 5 classes
loss(weights.cuda(), labels.cuda())
Out[13]: tensor(1.7329, device='cuda:0')

The error is triggered if I use the CPU though.

Could you update to the nightly binaries and check again, please?
An issue that silenced the assert statements in the CUDA code was fixed recently.
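Side note: device-side asserts are reported asynchronously, so the error may only surface at a later, unrelated call (such as printing the tensor). To make the failing op show up directly, you can run the script with blocking kernel launches via the standard CUDA environment variable (the script name below is just a placeholder):

```shell
# Force synchronous kernel launches so the device-side assert
# is reported at the op that actually triggered it
CUDA_LAUNCH_BLOCKING=1 python my_script.py
```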

Hi,

Sorry it took me some time to answer!
I now get this error, which seems appropriate:

/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [5,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [6,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [7,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [8,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [9,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/leo/Venv/test/lib/python3.7/site-packages/torch/tensor.py", line 154, in __repr__
    return torch._tensor_str._str(self)
  File "/home/leo/Venv/test/lib/python3.7/site-packages/torch/_tensor_str.py", line 333, in _str
    tensor_str = _tensor_str(self, indent)
  File "/home/leo/Venv/test/lib/python3.7/site-packages/torch/_tensor_str.py", line 229, in _tensor_str
    formatter = _Formatter(get_summarized_data(self) if summarize else self)
  File "/home/leo/Venv/test/lib/python3.7/site-packages/torch/_tensor_str.py", line 101, in __init__
    nonzero_finite_vals = torch.masked_select(tensor_view, torch.isfinite(tensor_view) & tensor_view.ne(0))
RuntimeError: copy_if failed to synchronize: cudaErrorAssert: device-side assert triggered

Thanks!