Softmax 2d doesn't make any sense, what am I missing?

Hi, I’m trying to use Softmax2d and I can’t see what I’m doing wrong.
I will show my problem using something that will be easier to understand.
I have this 2d matrix of values and I want to turn it into a probability matrix:

so I’m using this code:


    result = self.softmax(result)

But the result I’m getting is all 0…, take a look:

I can’t understand why. My “sanity check” is that the elements of the entire matrix are supposed to sum to 1 (I tried increasing the precision, no luck).
I know I’m missing something, but I can’t figure out what.

Please help, Thanks!

Could you post the shape of your matrix?
nn.Softmax2d applies the softmax over the channel dimension.

You could try the following code:

import torch
import torch.nn as nn

x = torch.randn(1, 1, 16, 16)
y = nn.Softmax(dim=2)(x.view(1, 1, -1)).view_as(x)  # softmax over the flattened 16x16 values

I will try this code.
My input shape is (1, 12, 16, 16), and I want each “channel” to get its own softmax.

Edit: I tried y = nn.Softmax(2)(x.view(1, 1, -1)).view_as(x)
and got the same result.

In that case, your code should be fine:

x = torch.randn(1, 12, 16, 16)
y = nn.Softmax2d()(x)

The sum over all channels will be 1 for each pixel position.
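To make the behaviour concrete, here is a small sketch (the variable names `channel_sums` and `map_sums` are just illustrative) that checks which dimension nn.Softmax2d normalizes over for a 4D input:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 12, 16, 16)
y = nn.Softmax2d()(x)

# Softmax2d normalizes over dim 1 (channels): for every pixel position,
# the 12 channel values add up to 1.
channel_sums = y.sum(dim=1)        # shape (1, 16, 16)

# The individual 16x16 maps are not normalized on their own;
# only the total over all channels and pixels is fixed (16 * 16).
map_sums = y.sum(dim=(2, 3))       # shape (1, 12)

print(channel_sums[0, 0, :4])
print(map_sums)
```

So if every 16x16 map is expected to sum to 1, Softmax2d is not the right module for that.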

But I don’t want the softmax across channels, because they are not really channels. I want each 16x16 2d map to get its own softmax, meaning that each 16x16 map will sum to 1.

Ah sorry, I misunderstood your use case.
My first code, adapted to your shape, should work then:

y = nn.Softmax(2)(x.view(*x.size()[:2], -1)).view_as(x)
print(y[0, 1].sum())

Are you sure it’s not working?
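For reference, the same idea can be wrapped in a small function and checked for all 12 maps at once. This is only a sketch: `spatial_softmax` is a hypothetical helper name, not a PyTorch API, and it assumes a 4D input of shape (N, C, H, W).

```python
import torch

def spatial_softmax(x):
    """Softmax over the H*W values of each (sample, channel) map.

    Hypothetical helper; assumes x has shape (N, C, H, W).
    """
    n, c, h, w = x.shape
    flat = x.reshape(n, c, h * w)          # flatten each 2d map into one axis
    return torch.softmax(flat, dim=2).reshape(n, c, h, w)

x = torch.randn(1, 12, 16, 16)
y = spatial_softmax(x)

# One sum per 16x16 map; each entry should be (numerically) 1.
per_map_sums = y.sum(dim=(2, 3))           # shape (1, 12)
print(per_map_sums)
```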


I think it worked, but doesn’t y[0, 1] index the first two dimensions of y, which are (1, 12), and not the (16, 16) ones that I care about?
Another weird thing is what I got:

In the first image in my first message you can see the values before the softmax. Isn’t it weird that there is only 1 value bigger than 0? Maybe this is the biggest value, but there were some other big values like 109/104/101 etc. Isn’t that weird!?

No, y[0, 1] selects batch index 0 and channel index 1, so it is a 16x16 tensor and this should be fine.
Also, that’s not really weird but expected, as nn.Softmax(dim=0)(torch.tensor([130., 109., 104.])) will give you almost exactly 1 for the logit of 130. The differences between the logits are just large, and softmax depends on them exponentially.
Have a look at the manual implementation:

torch.exp(x - x.max()) / torch.exp(x - x.max()).sum()
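As a quick check of that formula on the three logits mentioned above, the following sketch compares the manual computation against torch.softmax (the variable names are only illustrative):

```python
import torch

logits = torch.tensor([130., 109., 104.])

# Subtracting the maximum keeps exp() from overflowing and does not
# change the result, since softmax is invariant to a constant shift.
shifted = logits - logits.max()            # [0., -21., -26.]
manual = torch.exp(shifted) / torch.exp(shifted).sum()

builtin = torch.softmax(logits, dim=0)

print(manual)
print(builtin)
```

Since exp(-21) and exp(-26) are tiny compared to exp(0) = 1, almost all of the probability mass goes to the largest logit, and the other entries print as (nearly) 0 at the default precision.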

Ok, got it, thanks a lot!