I am trying to do semantic segmentation. I have label-encoded the RGB masks and hence have a ground truth of shape [batch, height, width].
When I try to use cross-entropy loss with my predictions, which are [batch, channels, height, width], I get “cuda runtime error (59) : device-side assert triggered at /pytorch/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu:128”
Please help me understand where I am going wrong.
loss = F.cross_entropy(pred, labels.squeeze(1))
The labels are of size [batch, 1, height, width], hence I used .squeeze(1).
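For reference, here is a minimal, self-contained sketch of the shapes involved (the sizes are made up, not my real data):

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes, just to illustrate the shapes involved
batch, nb_classes, height, width = 2, 5, 4, 4

# Predictions: raw logits with one channel per class
pred = torch.randn(batch, nb_classes, height, width)

# Label-encoded ground truth as it comes from the dataloader: [batch, 1, H, W]
labels = torch.randint(0, nb_classes, (batch, 1, height, width))

# cross_entropy wants targets of shape [batch, H, W] holding class indices
loss = F.cross_entropy(pred, labels.squeeze(1))
print(loss)
```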
You don’t need the channel dimension for the labels.
nn.CrossEntropyLoss expects the labels to have the shape
[batch_size, height, width] in your semantic segmentation use case, containing class indices.
It looks like you can just pass
labels without any modification.
Thanks for the reply.
The shape of the labels I get from the dataloader is [batch, 1, height, width]. Do you suggest I should pass it as it is to cross entropy?
No, in that case you would have to squeeze
labels = labels.squeeze(1) and pass it to the criterion.
Sure, I am already doing that. However, I am still getting the error.
Have a look at the stack trace, maybe you can make better sense of it.
Could you try to run your code on the CPU and see if you get a better error message?
Due to the asynchronous CUDA calls, the stack trace might point to a wrong line of code.
You could also run your code using
CUDA_LAUNCH_BLOCKING=1 python script.py args to get a valid stack trace.
I guess the class indices might be out of bounds, i.e. your
labels should contain indices in the range
[0, nb_classes-1]. You could add a print statement in your training loop and check the min and max values of your labels.
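Something along these lines should do it (the names are assumed from your snippets above):

```python
import torch

def check_labels(labels, nb_classes):
    """Print the label range and fail fast on out-of-bounds class indices."""
    lo, hi = labels.min().item(), labels.max().item()
    print(f"label range: [{lo}, {hi}]")
    assert 0 <= lo and hi < nb_classes, (
        f"labels must lie in [0, {nb_classes - 1}], got [{lo}, {hi}]"
    )

# e.g. call check_labels(labels, nb_classes) once per batch in the training loop
```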
Excellent suggestion. I wonder why I didn't think of that. Will update you in some time.
Thanks for the help.
Thanks for the info. The class indices in
labels are indeed out of bounds.
Make sure to only provide labels in the range
[0, nb_classes-1]. E.g. if you are dealing with 5 classes for your segmentation task, your labels should only contain the values
[0, 1, 2, 3, 4].
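If your encoded masks don't already hold contiguous indices, you could remap them, e.g. like this (a sketch; the raw mask values here are made up):

```python
import torch

# Suppose the encoded masks store the raw values {0, 50, 100, 150, 200}
# for a 5-class problem (values are made up for illustration)
raw_values = [0, 50, 100, 150, 200]
mask = torch.tensor([[0, 50], [200, 100]])

# Map every raw value onto a contiguous class index in [0, nb_classes-1]
labels = torch.empty_like(mask)
for idx, value in enumerate(raw_values):
    labels[mask == value] = idx

print(labels)  # tensor([[0, 1], [4, 2]])
```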
Thanks for the info. Will surely fix this.
Appreciate your early response.
Found the issue. The label encoding was perfectly fine. The mistake I was making was normalising the images using
transforms, which was distorting the values.
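For anyone hitting the same thing, this is the kind of split I moved to: normalise the image only and leave the mask untouched (a sketch; the mean/std stats are placeholders, not my actual values):

```python
import torch

# Placeholder ImageNet-style stats; only the *image* gets normalised
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

def transform_pair(image, mask):
    """Normalise the image, but leave the label mask untouched."""
    image = (image - mean) / std
    mask = mask.long()   # class indices must stay exactly as encoded
    return image, mask
```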
Thanks for the support. Really appreciate.
Things have started to work. However, I am getting the
cross-entropy loss as
0.00000 all the time. Can you suggest what could have gone wrong?
How did you compute
pred? It should contain the class logits without any non-linearity applied to them.
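To illustrate (random tensors, not your actual model): cross_entropy applies log_softmax internally, so feeding it probabilities instead of logits gives a different, flattened loss:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
pred = torch.randn(2, 5, 4, 4)            # raw logits straight from the model
target = torch.randint(0, 5, (2, 4, 4))

# Correct: cross_entropy applies log_softmax internally
loss_logits = F.cross_entropy(pred, target)

# Wrong: softmax-ing first normalises twice and distorts the loss
loss_softmaxed = F.cross_entropy(pred.softmax(dim=1), target)

print(loss_logits.item(), loss_softmaxed.item())  # the two values differ
```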
Could you print some samples of it?
These are some of the samples. Where am I going wrong?
It looks alright, assuming that you rescale your
labels to class indices as discussed earlier.
Have you tried to visualize a prediction of your model as a sanity check?
Yeah, it looks sane with some random colors. However, should the loss be 0? The loss begins at
0.2 for the first epoch's training, but plateaus at
0 during its validation phase itself.
If the loss is that low, you could try to print and visualize it on a logarithmic scale, e.g. apply torch.log to your loss and see how it develops.
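A tiny sketch of that (the loss value here is made up; the epsilon guards against log(0) if the loss really underflows):

```python
import torch

loss = torch.tensor(1e-7)   # stand-in for a tiny cross-entropy loss
eps = 1e-12                 # avoids log(0) when the loss underflows to zero
log_loss = torch.log(loss + eps)
print(log_loss.item())      # roughly -16.1, much easier to track than 0.0000
```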
Will try this. I am not doing anything wrong here, am I? I mean, the samples look alright to you?
Actually, this is the first time I am doing segmentation, and all the code available online is for two-class classification.
By the way, I am calculating loss like this:
ce = F.cross_entropy(pred, target.squeeze(1).long() * 255)