Implicit dimension choice for softmax warning

(Lino Hong) #1

Hey guys,
I was following the official tutorial exactly as written on the site.
However, I got stuck on the softmax function: the tutorial shows no warning, but my Python gives me this warning message:

UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
return F.log_softmax(self.linear(bow_vec))

  1. Does anyone know what the ‘dim’ argument does?
  2. Why does it produce a warning on my machine?

(Yun Chen) #2

The dimension along which softmax will be computed.


output = softmax(input, dim=2)


output.sum(dim=2) is all 1
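As a minimal runnable sketch of the point above (assuming a 3-D input, so dim=2 is the last axis; in current PyTorch no Variable wrapping is needed):

```python
import torch
import torch.nn.functional as F

x = torch.randn(2, 3, 4)     # any 3-D tensor
y = F.softmax(x, dim=2)      # normalise along the last axis
print(y.sum(dim=2))          # every entry is 1.0 (up to float error)
```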

(Lino Hong) #3

Do you mean that if the input to the softmax function is a variable comprised of, let’s say, 6 elements, then we should specify dim=6?

(Yun Chen) #4

No, I mean:

import torch as t
from torch.autograd import Variable
from torch.nn.functional import softmax

a = t.randn(3, 4, 5)  # random values; t.Tensor(3, 4, 5) would be uninitialized
b = softmax(Variable(a), dim=2)
b.sum(2)  # is all 1

(Lino Hong) #5

I don’t think I understand what ‘dim’ actually means.
For the example you wrote, Python says the expected range of dim is [-3, 2].
I think I’m confused about the definition of dim in this context, since the word has so many meanings in machine learning.
Could you explain more specifically what it means?

(Andrés Herrera) #6

I came across exactly the same error.
Just by following the tutorial, when typing

python train.py -data data/demo -save_model demo-model

it gives me the same warning.

/OpenNMT-py/onmt/modules/ UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.

Can anyone please help with how this can be resolved?


You could add the dim parameter to the softmax call to get rid of this warning.
Have a look at the dimensions of the input to softmax and try to figure out in which dimension the softmax should be computed.

(Diego) #8

In this context, dim refers to the dimension along which the softmax function will be applied.

>>> a = Variable(torch.randn(5,2))
>>> F.softmax(a, dim=1)
Variable containing:
 0.6360  0.3640
 0.3541  0.6459
 0.2412  0.7588
 0.0860  0.9140
 0.6258  0.3742
[torch.FloatTensor of size 5x2]

>>> F.softmax(a, dim=0)
Variable containing:
 0.6269  0.3177
 0.0543  0.0877
 0.1482  0.4128
 0.0103  0.0969
 0.1603  0.0849
[torch.FloatTensor of size 5x2]

In the first case (using dim=1) the softmax function is applied along axis 1; that is why all the rows add up to 1. In the second case (using dim=0) it is applied along axis 0, making all the columns add up to 1.
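The same dim=0 vs dim=1 behaviour can be reproduced in plain Python, without PyTorch, as a small sketch of what the two normalisations actually do:

```python
import math

def softmax(values):
    # softmax over a flat list of numbers
    exps = [math.exp(v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# a 2-D "tensor" as nested lists (2 rows, 3 columns)
a = [[1.0, 2.0, 3.0],
     [1.0, 1.0, 1.0]]

# dim=1: normalise each row independently -> every row sums to 1
rows = [softmax(r) for r in a]

# dim=0: normalise each column independently -> every column sums to 1
cols = list(zip(*a))                       # transpose to get columns
a_dim0 = [list(r) for r in zip(*[softmax(c) for c in cols])]

print([sum(r) for r in rows])              # all 1.0
print([sum(c) for c in zip(*a_dim0)])      # all 1.0
```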


Just out of curiosity, how did the softmax function determine the dimension before the deprecation? Was it always 1?

(Ruotian(RT) Luo) #10

For matrices, it’s 1. For others, it’s 0.

(Prasad Sogalad) #12

If your batch size is, say, 64, and the number of classes you want the CNN/NN to predict is, say, 10, then the output tensor will have shape 64×10.
Set dim=1 to apply softmax across the classes, so that each row (one sample’s class scores) sums to 1.

m = nn.Softmax(dim=1)  # softmax over the classes in each row
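A quick runnable check of this, assuming a batch of 64 random logits over 10 classes:

```python
import torch
import torch.nn as nn

logits = torch.randn(64, 10)   # 64 samples, 10 class scores each
m = nn.Softmax(dim=1)          # normalise over the 10 classes, not the batch
probs = m(logits)

print(probs.shape)             # torch.Size([64, 10])
print(probs.sum(dim=1))        # every entry is 1.0 (up to float error)
```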