Unable to get repr for <class 'torch.Tensor'> in CrossEntropyLoss

I ran into this problem with the following line:

loss2 = (1 - weight) * loss_function( student_result, torch.squeeze(label,dim=1))

student_result has shape (4, 7, 128, 128, 128).
label has shape (4, 1, 128, 128, 128).
Both of them are on the GPU.
student_result comes from the model's final layer, which is a single Conv3d() that changes the channel count to 7.

I have tried to use torch.softmax(student_result,dim=1) to fix it, but it did not work.

When I move both student_result and label to the CPU and comment out `with autocast():`, it works fine. So what should I do to make this code work correctly on the GPU in fp16 mode?

Your code works fine:

import torch
import torch.nn as nn

student_result = torch.randn(4, 7, 128, 128, 128, device='cuda', requires_grad=True)
label = torch.randint(0, 7, (4, 1, 128, 128, 128), device='cuda')  # valid class indices 0..6
weight = torch.rand_like(student_result)

loss_function = nn.CrossEntropyLoss()
loss2 = (1 - weight) * loss_function(student_result, torch.squeeze(label, dim=1))

So could you explain the issue in more detail and show where this error is thrown?

hi @ptrblc:

Here is my train code:

amp_grad_scaler = GradScaler()
for epoch in range(begin_epoch,end_epoch):
        for i,batch in enumerate(DataLoader):
            with torch.no_grad():
            with autocast():

Error message:

Traceback (most recent call last):
  File "/home/XXX/Code_Wrap/Distilling_NNUNET-main/main.py", line 133, in main
  File "/home/XXX/Code_Wrap/Distilling_NNUNET-main/main.py", line 53, in LossFunction
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Although this message suggests that loss1 may be wrong, the "unable to get repr" error occurred in loss2, which I pasted above.
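The last line of the error message suggests setting CUDA_LAUNCH_BLOCKING=1 so that the stack trace points at the call that actually failed. A minimal sketch of one way to do that from inside the script, assuming it is placed at the very top of the entry file before any CUDA work happens:

```python
import os

# Assumption: this runs at the very top of the entry script, before any CUDA call.
# Setting the variable before importing torch is the safest order.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # imported only after the variable is set

print("CUDA_LAUNCH_BLOCKING =", os.environ["CUDA_LAUNCH_BLOCKING"])
print("CUDA available:", torch.cuda.is_available())
```

Setting the variable in the shell when launching the script (CUDA_LAUNCH_BLOCKING=1 python main.py) should have the same effect. Kernel launches become synchronous, so training runs slower; this is meant for debugging only.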

The error has now changed to:

RuntimeError: CUDA error: device-side assert triggered

which is raised, for example, if you pass invalid target tensors to nn.CrossEntropyLoss:

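import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

output = torch.randn(10, 10, device='cuda', requires_grad=True)  # 10 samples, 10 classes
target = torch.randint(0, 10, (10,), device='cuda')
target[0] = 10  # invalid: valid class indices are 0..9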

loss = criterion(output, target)
# ../aten/src/ATen/native/cuda/Loss.cu:271: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [0,0,0] Assertion `t >= 0 && t < n_classes` failed.

Check the min and max values of the target and make sure they lie in [0, nb_classes - 1]. In your case the model outputs 7 channels, so every label value has to be in [0, 6].
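A minimal sketch of such a check, which could be called right before the loss. The helper name check_targets is hypothetical (not part of PyTorch), and the small tensor shapes are only for illustration; the ignore_index default of -100 matches the default of nn.CrossEntropyLoss:

```python
import torch

def check_targets(target, num_classes, ignore_index=-100):
    """Raise ValueError if any class index falls outside [0, num_classes - 1].

    Hypothetical helper for debugging; values equal to ignore_index are skipped.
    """
    valid = target[target != ignore_index]
    if valid.numel() == 0:
        return  # nothing to check
    t_min = valid.min().item()
    t_max = valid.max().item()
    if t_min < 0 or t_max >= num_classes:
        raise ValueError(
            f"target values span [{t_min}, {t_max}], "
            f"expected [0, {num_classes - 1}]"
        )

# Small CPU stand-in for the (N, 1, D, H, W) label tensor from the question.
label = torch.randint(0, 7, (2, 1, 4, 4, 4))
check_targets(torch.squeeze(label, dim=1), num_classes=7)  # passes silently

label[0, 0, 0, 0, 0] = 7  # one voxel with an out-of-range class index
try:
    check_targets(torch.squeeze(label, dim=1), num_classes=7)
except ValueError as err:
    print("invalid labels:", err)
```

Running the check on the real label batch, before the autocast block, should show whether the dataset contains values such as 7 or 255 that need to be remapped or passed as ignore_index.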
