Implicit dimension choice for softmax warning

Diego · February 27, 2018, 5:55pm

In this context, dim refers to the dimension along which the softmax function is applied.

>>> import torch
>>> import torch.nn.functional as F
>>> from torch.autograd import Variable
>>> a = Variable(torch.randn(5,2))
>>> F.softmax(a, dim=1)
Variable containing:
 0.6360  0.3640
 0.3541  0.6459
 0.2412  0.7588
 0.0860  0.9140
 0.6258  0.3742
[torch.FloatTensor of size 5x2]

>>> F.softmax(a, dim=0)
Variable containing:
 0.6269  0.3177
 0.0543  0.0877
 0.1482  0.4128
 0.0103  0.0969
 0.1603  0.0849
[torch.FloatTensor of size 5x2]

In the first case (dim=1), the softmax function is applied along axis 1, which is why each row adds up to 1. In the second case (dim=0), it is applied along axis 0, so each column adds up to 1.
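To make the row/column behaviour concrete without needing PyTorch installed, here is a minimal plain-Python sketch of the same idea. It is not how F.softmax is implemented; softmax_2d is a hypothetical helper written for illustration, and the values in a are arbitrary numbers chosen for the example rather than the random tensor from the session above.

```python
import math

def softmax_2d(rows, dim):
    """Softmax over a 2-D list of floats along axis `dim` (0 or 1).

    Hypothetical helper for illustration only, not PyTorch's implementation.
    """
    if dim not in (0, 1):
        raise ValueError("dim must be 0 or 1 for a 2-D input")
    # For dim=0, transpose so each column becomes a row; transpose back at the end.
    data = [list(col) for col in zip(*rows)] if dim == 0 else rows
    out = []
    for vec in data:
        m = max(vec)  # subtract the max for numerical stability
        exps = [math.exp(v - m) for v in vec]
        total = sum(exps)
        out.append([e / total for e in exps])
    return [list(col) for col in zip(*out)] if dim == 0 else out

# Arbitrary 5x2 input (assumed values, same shape as the tensor in the post).
a = [[0.5, -0.1], [1.2, 1.8], [-0.7, 0.4], [-2.0, 0.3], [0.9, 0.4]]

by_row = softmax_2d(a, dim=1)  # normalizes within each row
by_col = softmax_2d(a, dim=0)  # normalizes within each column

row_sums = [sum(r) for r in by_row]        # five values, each approximately 1.0
col_sums = [sum(c) for c in zip(*by_col)]  # two values, each approximately 1.0
print(row_sums)
print(col_sums)
```

With dim=1 the normalization runs across the 2 entries of each row, so there are five sums of roughly 1. With dim=0 it runs down the 5 entries of each column, so there are two sums of roughly 1.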