Channel wise cross entropy issue

Anshumaan_Dash · February 12, 2019, 1:09pm

Hi,

I am trying to do semantic segmentation. I have label-encoded the RGB masks and hence have a ground truth of shape [batch, height, width].

When I try to use cross-entropy loss with my predictions, which are [batch, channels, height, width], I get “cuda runtime error (59) : device-side assert triggered at /pytorch/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu:128”

Please help me understand where I am going wrong.

loss = F.cross_entropy(pred, labels.squeeze(1))

Anshumaan_Dash · February 12, 2019, 1:10pm

The labels are of size [batch, 1, height, width], hence I used .squeeze(1).

ptrblck · February 12, 2019, 1:16pm

You don’t need the channel dimension for the labels. nn.CrossEntropyLoss expects the labels to have the shape [batch_size, height, width] in your semantic segmentation use case, containing class indices.
It looks like you can just pass labels without any modification.
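As a minimal sketch of the shapes described above (the sizes and the variable names are illustrative assumptions, not taken from the original code), the logits keep a class dimension while the target holds one class index per pixel:

```python
import torch
import torch.nn.functional as F

# Illustrative sizes only; they are not from the original post.
batch, nb_classes, height, width = 2, 5, 4, 4

# Raw, unnormalised scores (logits): one channel per class.
pred = torch.randn(batch, nb_classes, height, width)

# One class index per pixel, dtype torch.long, no channel dimension.
labels = torch.randint(0, nb_classes, (batch, height, width))

# With the default reduction the result is a scalar averaged over all pixels.
loss = F.cross_entropy(pred, labels)
print(pred.shape, labels.shape, loss.shape)
```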

Anshumaan_Dash · February 12, 2019, 1:22pm

Thanks for the reply.

The shape of the labels I get from the dataloader is [batch, 1, height, width]. Do you suggest I should pass it as it is to cross entropy?

ptrblck · February 12, 2019, 1:23pm

No, in that case you would have to squeeze dim1: labels = labels.squeeze(1) and pass it to the criterion.
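A small sketch of why the explicit dimension in squeeze(1) matters (the tensor here is a made-up example with batch size 1): calling squeeze() without an argument would also drop the batch dimension.

```python
import torch

# Made-up label tensor with batch size 1: [batch, 1, height, width].
labels = torch.zeros(1, 1, 8, 8, dtype=torch.long)

safe = labels.squeeze(1)   # removes only the channel dimension
risky = labels.squeeze()   # removes every size-1 dimension, including batch

print(safe.shape)   # torch.Size([1, 8, 8])
print(risky.shape)  # torch.Size([8, 8])
```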

Anshumaan_Dash · February 12, 2019, 1:27pm

Sure. I am already doing that. However, I am still getting the error.

Have a look at the stack trace; maybe you can make better sense of it.

ptrblck · February 12, 2019, 1:30pm

Could you try to run your code on CPU and see if you get a better error message?
Due to the asynchronous CUDA calls, the stack trace might point to a wrong line of code.
You could also run your code using CUDA_LAUNCH_BLOCKING=1 python script.py args to get a valid stack trace.

I guess the class indices might be out of bounds, i.e. your labels should contain indices in the range [0, nb_classes-1]. You could add a print statement in your training loop and check the min and max values of your labels.
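The min/max check could look roughly like this. check_label_range is a hypothetical helper name, and the two tensors are made-up examples rather than the poster's data:

```python
import torch

def check_label_range(labels, nb_classes):
    """Return (ok, min, max); ok is True if all labels lie in [0, nb_classes - 1]."""
    lo, hi = int(labels.min()), int(labels.max())
    ok = 0 <= lo and hi < nb_classes
    return ok, lo, hi

nb_classes = 5  # assumed number of classes
good = torch.tensor([[0, 1], [4, 2]])
bad = torch.tensor([[0, 255], [3, 1]])  # 255 is out of bounds for 5 classes

print(check_label_range(good, nb_classes))  # (True, 0, 4)
print(check_label_range(bad, nb_classes))   # (False, 0, 255)
```

Calling it once per batch inside the training loop, before the loss is computed, should surface the offending values without needing the CUDA stack trace.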

Anshumaan_Dash · February 12, 2019, 1:31pm

Excellent suggestion. I wonder why I didn’t think of that. Will update you in some time.

Thanks for the help.

Anshumaan_Dash · February 12, 2019, 1:38pm

This is the error.

ptrblck · February 12, 2019, 1:40pm

Thanks for the info. The class indices in labels are indeed out of bounds.
Make sure to only provide labels in the range [0, nb_classes-1]. E.g. if you are dealing with 5 classes for your segmentation task, your labels should only contain the values [0, 1, 2, 3, 4].
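If the masks store arbitrary pixel values instead of contiguous indices, one way to remap them is a lookup table. This is only a sketch: the five raw values below are invented for illustration, and any value missing from the table maps to -1 so it can be spotted.

```python
import torch

# Hypothetical raw pixel values used for the 5 classes in the mask files.
raw_values = torch.tensor([0, 50, 100, 150, 255])

# Lookup table over all 8-bit values; unknown values stay at -1.
lut = torch.full((256,), -1, dtype=torch.long)
lut[raw_values] = torch.arange(len(raw_values))

raw_mask = torch.tensor([[0, 255], [100, 50]])
labels = lut[raw_mask]  # same shape as raw_mask, values in [0, 4]

print(labels)  # tensor([[0, 4], [2, 1]])
```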

Anshumaan_Dash · February 12, 2019, 1:42pm

Thanks for the info. Will surely fix this.
Appreciate your early response.

Anshumaan_Dash · February 12, 2019, 6:19pm

Hey,

Found the issue. The label encoding was perfectly fine. The mistake I was making was normalising the images using transforms. This was distorting the values.
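The thread does not show the transform pipeline, so the following is only a sketch of the effect, assuming the mask went through the same scaling and normalisation as the input image (the mean and std of 0.5 are hypothetical):

```python
import torch

# Made-up mask that stores class indices as 8-bit pixel values.
mask = torch.tensor([[0, 1], [2, 4]], dtype=torch.uint8)

# Image-style preprocessing: scale to [0, 1], then standardise.
as_image = mask.float() / 255.0
normalised = (as_image - 0.5) / 0.5  # hypothetical mean and std

# Label-style preprocessing: keep the integers untouched.
as_labels = mask.long()

print(normalised)  # floats near -1, no longer usable as class indices
print(as_labels)
```

Under this assumption, the usual remedy is to apply value-changing transforms to the image only and give the mask just the geometric ones.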

Thanks for the support. Really appreciate it.

Anshumaan_Dash · February 13, 2019, 6:46pm

Hi,

Things have started to work. However, I am getting the cross-entropy loss as 0.00000 all the time. Can you suggest what could have gone wrong?

Thanks.

ptrblck · February 13, 2019, 6:48pm

How did you compute pred? It should contain the class logits without any non-linearity applied to them.
Could you print some samples of pred and labels?
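To illustrate why pred should be raw logits: F.cross_entropy applies log_softmax internally, so passing softmax outputs squashes the loss into a narrow band. The tensors below are random stand-ins, not the poster's data.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(2, 3, 4, 4) * 5      # random stand-in for model output, 3 classes
labels = torch.randint(0, 3, (2, 4, 4))   # random stand-in for the target

# Intended usage: raw logits go straight into the loss.
loss_logits = F.cross_entropy(logits, labels)

# Equivalent manual form, showing the log_softmax that happens internally.
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

# Mistake: softmax applied first, so the normalisation happens twice.
loss_double = F.cross_entropy(F.softmax(logits, dim=1), labels)

print(loss_logits.item(), loss_manual.item(), loss_double.item())
```

With inputs restricted to [0, 1], the per-pixel loss for 3 classes cannot drop below log(e + 2) - 1, roughly 0.55, so a loss stuck in that region is one hint that a softmax was applied before the criterion.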

Anshumaan_Dash · February 13, 2019, 7:01pm

These are some of the samples. Where am I going wrong?

ptrblck · February 14, 2019, 11:20am

It looks alright, assuming that you rescale your labels to class indices in torch.long format.
Have you tried to visualize a prediction of your model as a sanity check?

Anshumaan_Dash · February 14, 2019, 11:33am

Yeah, it looks sane with some random colors. However, should the loss be 0? The loss begins at 0.2 during the first epoch’s training, but plateaus at 0 during its validation phase itself.

ptrblck · February 14, 2019, 11:35am

If the loss is that low, you could try to print and visualize it in logarithmic scale.
Just call torch.log on your loss and see how it develops.
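A quick sketch of why a displayed 0.00000 need not be an exact zero (the loss value here is invented): a fixed five-decimal format hides anything below 1e-5, while torch.log or scientific notation still shows it.

```python
import torch

loss = torch.tensor(3e-7)  # invented tiny loss value

fixed = f"{loss.item():.5f}"       # '0.00000': looks like zero
scientific = f"{loss.item():.2e}"  # scientific notation keeps the magnitude
log_loss = torch.log(loss)         # large negative number, still finite

print(fixed, scientific, log_loss.item())
```

Note that torch.log returns -inf if the loss really is exactly 0, which would itself be informative.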

Anshumaan_Dash · February 14, 2019, 11:40am

Will try this. I am not doing anything wrong here, am I? I mean, the samples look alright to you?

Actually, this is the first time I am doing segmentation, and all the code available online is for two-channel classification.

By the way, I am calculating loss like this: ce = F.cross_entropy(pred, target.squeeze(1).long() * 255)
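One thing worth checking in that line is the order of operations. The sketch below assumes the target is a float tensor scaled to [0, 1], for example by ToTensor; the thread does not confirm this. In that case .long() truncates every value below 1 to 0 before the multiplication by 255 happens.

```python
import torch

# Assumption: class indices 0..3 stored in an 8-bit mask and scaled to [0, 1].
target = torch.tensor([[[[0, 1], [2, 3]]]], dtype=torch.float32) / 255.0  # [1, 1, 2, 2]

# Cast first, then scale: every fractional value is truncated to 0.
cast_first = target.squeeze(1).long() * 255

# Scale first, round away float error, then cast.
scale_first = (target.squeeze(1) * 255).round().long()

print(cast_first)   # tensor([[[0, 0], [0, 0]]])
print(scale_first)  # tensor([[[0, 1], [2, 3]]])
```

If that assumption holds, every pixel would be labelled as class 0, and a model that always predicts class 0 would reach a loss near 0, which might explain the plateau.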