Compute mse_loss() with softmax()

Hi Mukesh!

I’m not familiar with your use case and I haven’t looked at the paper
you cite, so what I say will be speculation. Nonetheless:

The term “confidence map” (together with the use of Softmax)
suggests to me that you might be dealing with probability-like
values. If so, cross-entropy (or some other probability-comparison
metric such as the Kullback-Leibler divergence) might be more
appropriate.
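
If you do go down that road, PyTorch's built-in KLDivLoss can compare two
probability distributions directly. Here is a minimal sketch (the shapes and
tensors are made up purely for illustration, not taken from your post):

```python
import torch
import torch.nn.functional as F

# made-up example: a batch of 8 predictions over 17 classes
logits = torch.randn(8, 17)                               # raw scores from a final Linear layer
target_probs = torch.softmax(torch.randn(8, 17), dim=1)   # probability-like target values

# KLDivLoss expects log-probabilities as input and probabilities as target
kl_loss = torch.nn.KLDivLoss(reduction='batchmean')
loss = kl_loss(F.log_softmax(logits, dim=1), target_probs)
```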

Your discussion suggests that your target heatmap is not made up
of categorical labels, but rather, of continuous probability-like values.

PyTorch’s built-in CrossEntropyLoss does not support such “soft”
labels, although they do make perfect sense. If you want to explore
the cross-entropy approach, you will have to write your own “soft-label”
version of cross entropy, which is easy enough to do, for example:
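
Here is a minimal sketch of such a soft-label cross entropy (the name
softXEnt() and the choice to average over the batch dimension are just
conventions of this sketch):

```python
import torch
import torch.nn.functional as F

def softXEnt(input, target):
    # input:  raw-score logits of shape [nBatch, nClass]
    # target: probability-like "soft" labels of the same shape
    logprobs = F.log_softmax(input, dim=1)
    # per-sample cross entropy, averaged over the batch
    return -(target * logprobs).sum() / input.shape[0]
```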

Note that when doing this you still do not want a final Softmax layer.
The input to softXEnt() will be the output of your final Linear
layer, understood to be raw-score logits. The target to softXEnt()
will, however, be probability-like values that range over [0.0, 1.0].
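
As a toy usage example (using the softXEnt() sketch above; the shapes are
made up for illustration):

```python
logits = torch.randn(8, 17)        # output of the final Linear layer (no Softmax applied)
soft_target = torch.rand(8, 17)    # probability-like values in [0.0, 1.0]
loss = softXEnt(logits, soft_target)
```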

I do have some questions about the shapes of your input and
heatmap.

First, what do the dimensions mean?

Second, you mention “17 confidence maps” while the second
dimension of your shapes is 16. Should those values be the same,
or is 16 an unrelated value with a different meaning?

Last, regardless of the meanings of the dimensions, your input and
heatmap have different shapes, which is logically inappropriate for
MSELoss. (MSELoss will broadcast, but you probably don’t want
that.) Why is the third dimension of input 2, and is that really what
you want?
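
To illustrate the broadcasting point (the shapes here are made up just to
show the mechanism, not meant to match yours):

```python
import torch

mse = torch.nn.MSELoss()
input = torch.randn(4, 16, 2)      # made-up "prediction" shape
target = torch.randn(4, 16, 1)     # target differs in the third dimension

# This runs (PyTorch warns and broadcasts the size-1 dimension), but each
# target value gets compared against both input values along that dimension,
# which is probably not the comparison you intend.
loss = mse(input, target)
```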

Best.

K. Frank
