Cross Entropy Loss between 3d tensors

mfcs · April 30, 2020, 3:08pm

Hi everyone!

Could you help me with the Cross Entropy Loss in PyTorch?

I searched the forum for an existing post with the same question but didn’t find one, which is why I’m posting about Cross Entropy Loss even though there are already many topics on it.

My problem is the following. The images I’m working with are in the CIE Lab color space. My CNN takes the L channel of the image as input, and its output is a tensor with the same width and height as the L channel but with 313 layers, i.e. shape (313, w, h). So for each pixel of the L channel there are 313 values. The target is generated from the a and b channels of the image: a probability distribution is generated for the (a, b) pair of each pixel, as shown in the attached figure.
I would like to use Cross Entropy Loss to calculate the loss between these two probability distributions. Each of the 313 color bins shown in the image would be a class, with a probability for each pixel.
I’ve seen in other posts that you can’t use a probability distribution as the target for Cross Entropy Loss in PyTorch. So how would this loss be calculated? It has to be calculated for each pixel in the image, and each pixel has its own probability distribution. How would the target be represented?
Would the target be a single matrix in which each position holds the number of the color bin (the class) with the highest probability for that pixel? How would Cross Entropy Loss compare a probability distribution with a color bin number that represents a class? Does Cross Entropy Loss accept a 3D tensor as input and compute the loss against the probability distribution of each pixel?

I tried to illustrate some of what I wrote in the drawing to make it easier to understand.

Thanks for your time and help.

Best regards,

Matheus Santos.

ptrblck · April 30, 2020, 3:18pm

If I understand your use case correctly, KLDivLoss might work.

Let me know if it works.

mfcs · April 30, 2020, 3:32pm

Thanks! I will read about this loss function and let you know if it works.

But in the case of Cross Entropy Loss, does it make sense for the target to be a matrix whose elements are the color bin numbers (classes) with the highest probability for each pixel, and for the CNN’s output to be a tensor with 313 layers in which each pixel has a probability distribution? Would Cross Entropy Loss calculate the error correctly?

I saw some people in other posts talking about soft targets or probabilities. I understood a soft target to be a probability distribution used as the target, where the probabilities are not just 0 or 1 but values between 0 and 1. Could the loss you suggested also be used in those soft-target cases?

ptrblck · April 30, 2020, 3:39pm

This sounds like a classical segmentation use case and should be possible.
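
To make the segmentation-style setup concrete, here is a minimal sketch. It assumes a batch dimension N, small illustrative sizes, and random tensors standing in for the real network output and target distribution. Taking the argmax of the per-pixel distribution is only one possible way to build the hard target, and it discards the rest of the distribution. The last lines show what the loss does with a class index: it picks the log-probability of that bin at each pixel.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
N, C, H, W = 2, 313, 4, 5  # illustrative sizes; C = 313 color bins as in the thread

logits = torch.randn(N, C, H, W)  # stand-in for the raw CNN output (no softmax applied)
soft = torch.softmax(torch.randn(N, C, H, W), dim=1)  # stand-in per-pixel target distribution

# Hard target: index of the most probable bin per pixel, shape (N, H, W), dtype int64.
target = soft.argmax(dim=1)

criterion = nn.CrossEntropyLoss()  # applies log_softmax over dim=1 internally
loss = criterion(logits, target)   # scalar, averaged over all N*H*W pixels

# The same value by hand: negative log-probability of the target bin at each pixel.
log_probs = torch.log_softmax(logits, dim=1)
manual = -log_probs.gather(1, target.unsqueeze(1)).mean()
print(loss.item(), manual.item())
```

Note that the input is expected to be unnormalized scores (logits) with the classes in dimension 1, not probabilities.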

Soft targets should also be possible using nn.KLDivLoss, and you should also be able to use ones and zeros as the target values.
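
A minimal sketch of the nn.KLDivLoss variant with per-pixel soft targets, assuming the same illustrative (N, 313, H, W) shapes and random stand-in tensors. nn.KLDivLoss expects log-probabilities as the input and (by default) plain probabilities as the target. The choice to sum and then divide by the number of pixels is an assumption made here to get a per-pixel mean.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
N, C, H, W = 2, 313, 4, 5  # illustrative sizes

logits = torch.randn(N, C, H, W)                         # stand-in for the CNN output
target = torch.softmax(torch.randn(N, C, H, W), dim=1)   # stand-in soft target, sums to 1 over dim=1

log_probs = F.log_softmax(logits, dim=1)  # input must be log-probabilities

criterion = nn.KLDivLoss(reduction="sum")
kl = criterion(log_probs, target) / (N * H * W)  # mean KL divergence per pixel

# Same quantity written out: sum over bins of p * (log p - log q), averaged over pixels.
manual = (target * (target.log() - log_probs)).sum() / (N * H * W)
print(kl.item(), manual.item())
```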

mfcs · April 30, 2020, 3:42pm

I see! Thanks!

But how does Cross Entropy Loss perform the computation between a probability distribution and a class value, instead of the probability of the class?

KFrank · April 30, 2020, 6:47pm

Hello Matheus!

I won’t comment on whether KL-divergence or cross-entropy is
likely to be the better loss function for your use case.

But (assuming I understand what you are asking), no, you can’t
use pytorch’s built-in CrossEntropyLoss with probabilities for
targets (sometimes called soft labels, a term I don’t much like).
It requires integer class labels (even though cross-entropy makes
perfect sense for targets that are probabilities).

However, you can write your own without much difficulty (or loss
of performance). See this post:
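
As a rough sketch of what such a hand-written loss could look like (not necessarily the code from the linked post): the helper name soft_cross_entropy, the (N, 313, H, W) shapes, and the random stand-in tensors are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def soft_cross_entropy(logits, target_probs):
    """Cross-entropy with probability targets (hypothetical helper).

    Both tensors have the classes in dim=1, e.g. shape (N, C, H, W).
    """
    log_probs = F.log_softmax(logits, dim=1)
    per_pixel = -(target_probs * log_probs).sum(dim=1)  # shape (N, H, W)
    return per_pixel.mean()

torch.manual_seed(0)
N, C, H, W = 2, 313, 4, 5  # illustrative sizes
logits = torch.randn(N, C, H, W)
target_probs = torch.softmax(torch.randn(N, C, H, W), dim=1)
loss = soft_cross_entropy(logits, target_probs)

# Sanity check: with one-hot targets it reduces to the built-in loss on class labels.
labels = torch.randint(0, C, (N, H, W))
one_hot = F.one_hot(labels, num_classes=C).permute(0, 3, 1, 2).float()
builtin = F.cross_entropy(logits, labels)
print(soft_cross_entropy(logits, one_hot).item(), builtin.item())
```

For a fixed target, this differs from the KL divergence only by the entropy of the target, which does not depend on the network output. PyTorch releases from 1.10 onward also accept class probabilities as the target of CrossEntropyLoss directly.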

Best.

K. Frank

mfcs · May 2, 2020, 12:21am

I see! Thanks for replying!
I will read about all of those suggestions!

While running some tests, I am trying to convert a tensor like this:

1)
tensor([[[ 1., 20.],
         [ 2., 21.],
         [ 3., 22.],
         [ 4., 23.]],

        [[ 5., 24.],
         [ 6., 25.],
         [ 7., 26.],
         [ 8., 27.]],

        [[ 9., 28.],
         [10., 29.],
         [11., 30.],
         [12., 31.]]])

To something like this:

2)
tensor([[ 1.,  5.,  9.],
        [ 2.,  6., 10.],
        [ 3.,  7., 11.],
        [ 4.,  8., 12.],
        [20., 24., 28.],
        [21., 25., 29.],
        [22., 26., 30.],
        [23., 27., 31.]])

Each row of the second tensor corresponds to one (x, y) position of the first tensor.
The values in each row are the values at that position in the first tensor, taken along the layer direction.

Could you help me with this? I have been trying transpose() and reshape() but I am not getting it right.
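
One possible way to get tensor 2) from tensor 1), sketched with permute() followed by reshape(). It assumes the first tensor has shape (layers, rows, columns) = (3, 4, 2), as printed above. The variable names t and out are illustrative.

```python
import torch

# Tensor 1) from above, shape (3, 4, 2): 3 layers, 4 rows, 2 columns.
t = torch.tensor([[[ 1., 20.], [ 2., 21.], [ 3., 22.], [ 4., 23.]],
                  [[ 5., 24.], [ 6., 25.], [ 7., 26.], [ 8., 27.]],
                  [[ 9., 28.], [10., 29.], [11., 30.], [12., 31.]]])

# Move the layer dimension last and put columns before rows, giving shape (2, 4, 3),
# then flatten the two spatial dimensions into one: shape (8, 3).
out = t.permute(2, 1, 0).reshape(-1, t.shape[0])
print(out)
```

The row order in 2) walks down the first column before moving to the second, which is why the dimensions are reversed here. If the more common row-by-row pixel order is acceptable, `t.permute(1, 2, 0).reshape(-1, t.shape[0])` would give the same rows in a different order.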