Implicit dimension choice for softmax warning

Hey guys,
I was following the official tutorial on the site exactly as written.
However, I got stuck at the softmax call: the tutorial shows no warning there, but my Python prints this warning message:

UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
return F.log_softmax(self.linear(bow_vec))

  1. Does anyone know what the ‘dim’ argument does?
  2. Why does my Python produce this warning?

dim is the dimension along which softmax will be computed.

output = softmax(input, dim=2)

After this call, output.sum(dim=2) is all 1s.

Do you mean that if the input to the softmax function is a variable made up of, let’s say, 6 elements, then we should state that it has 6 elements by passing dim=6?

No, I mean:

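import torch
import torch.nn.functional as F

a = torch.randn(3, 4, 5)   # 3-D input with random values
b = F.softmax(a, dim=2)    # softmax over the last dimension (size 5)
b.sum(2)  # is all 1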

I don’t think I understand what ‘dim’ actually means.
For the example you wrote, Python says the expected range of dim is [-3, 2].
I think I’m confused about what dim means in this context, since the word has so many meanings in machine learning.
Could you explain more specifically what it means?
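
To make the [-3, 2] range concrete, here is a minimal sketch using a tensor of the same shape as in the example above. dim is an index into the tensor's shape, not a size; the variable names are only for illustration.

```python
import torch
import torch.nn.functional as F

x = torch.randn(3, 4, 5)      # a tensor with 3 dimensions, of sizes 3, 4 and 5

# dim is an index into x.shape, so the valid values are 0, 1, 2
# (or the negative aliases -3, -2, -1, counted from the end).
out = F.softmax(x, dim=2)     # normalize over the last axis (size 5)

print(out.shape)              # same shape as the input
print(out.sum(dim=2))         # a 3x4 tensor whose entries are all approximately 1

# For a 3-D tensor, dim=-1 refers to the same axis as dim=2.
same = torch.allclose(out, F.softmax(x, dim=-1))
print(same)
```

So passing dim=6 would be out of range here: the tensor has only three axes to choose from.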


I ran into exactly the same warning.
Just following the tutorial, when I type

python -data data/demo -save_model demo-model

it gives me the same warning.

/OpenNMT-py/onmt/modules/ UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.

Can anyone please help with how to resolve this?

You could add the dim argument to the softmax function call to get rid of this warning.
Have a look at the shape of the input to softmax and work out along which dimension the softmax should be calculated.


In this context, dim refers to the dimension along which the softmax function is applied.

>>> a = Variable(torch.randn(5,2))
>>> F.softmax(a, dim=1)
Variable containing:
 0.6360  0.3640
 0.3541  0.6459
 0.2412  0.7588
 0.0860  0.9140
 0.6258  0.3742
[torch.FloatTensor of size 5x2]

>>> F.softmax(a, dim=0)
Variable containing:
 0.6269  0.3177
 0.0543  0.0877
 0.1482  0.4128
 0.0103  0.0969
 0.1603  0.0849
[torch.FloatTensor of size 5x2]

In the first case (dim=1), the softmax function is applied along axis 1, which is why every row adds up to 1. In the second case (dim=0), it is applied along axis 0, so every column adds up to 1.
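
The same check can be written against current PyTorch, where Variable is no longer needed. This is a small sketch that verifies the row and column sums directly; the seed is arbitrary and only there for repeatability.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)               # arbitrary seed, only for repeatability
a = torch.randn(5, 2)              # 5 rows, 2 columns

rows = F.softmax(a, dim=1)         # normalize within each row
cols = F.softmax(a, dim=0)         # normalize within each column

print(rows.sum(dim=1))             # 5 values, each approximately 1
print(cols.sum(dim=0))             # 2 values, each approximately 1
```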


Just out of curiosity, how did the softmax function determine the dimension before the deprecation? Was it always 1?


For matrices, it’s 1. For others, it’s 0.
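
As far as I recall, the deprecated implicit choice was made by a private helper in torch.nn.functional, and the rule was slightly more specific than "matrices vs. others". The function below is a standalone restatement of that rule from memory, not the library code, and the name legacy_implicit_dim is made up here; please check the PyTorch source for your version before relying on it.

```python
def legacy_implicit_dim(ndim: int) -> int:
    """Dimension the deprecated implicit softmax would pick (assumed rule)."""
    # Assumption: 0-D, 1-D and 3-D inputs used dim 0;
    # everything else (2-D matrices, 4-D batches, ...) used dim 1.
    return 0 if ndim in (0, 1, 3) else 1


for ndim in range(5):
    print(ndim, legacy_implicit_dim(ndim))
```

Because the choice depended on the number of dimensions, passing dim explicitly is the only way to be sure which axis gets normalized.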


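If your batch size is, say, 64 and the number of classes you want the CNN/NN to classify is, say, 10, then the output tensor will have shape 64 x 10.
Set dim=1 to apply softmax across the 10 class scores in each row; dim=0 would instead normalize each class column across the 64 samples.

m = nn.Softmax(dim=1)  # softmax over the classes, so each row sums to 1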


Hmm, might it be a good idea to throw an error rather than a warning, so that the code actually stops running? It might save people headaches when debugging. I couldn’t even see the warning because I was printing a bunch of other stuff. If the code had just failed, it would have saved me a bunch of time.
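
Until then, Python's standard warnings module can turn this particular warning into an exception on your side. The sketch below uses a stand-in function (legacy_call, a hypothetical name) that only emits a warning with the same wording, so it runs without PyTorch; in real code you would install the same filter before calling softmax or log_softmax.

```python
import warnings


def legacy_call():
    # Stand-in for a call such as F.log_softmax(x) without dim;
    # it only emits a warning with the same wording.
    warnings.warn(
        "Implicit dimension choice for log_softmax has been deprecated. "
        "Change the call to include dim=X as an argument.",
        UserWarning,
    )
    return "finished"


with warnings.catch_warnings():
    # Turn any UserWarning whose message starts with this text into an exception.
    warnings.filterwarnings(
        "error", message="Implicit dimension choice", category=UserWarning
    )
    try:
        legacy_call()
        raised = False
    except UserWarning as exc:
        raised = True
        print("stopped on:", exc)
```

With the filter active, the run stops at the offending call and the traceback points at the line that needs a dim argument.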

Softmax(dim=1) for linear output.
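
Applied to the call from the first post, that could look like the sketch below. The class name BoWClassifier and the sizes are assumptions for illustration; only self.linear and bow_vec come from the original snippet.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BoWClassifier(nn.Module):  # hypothetical name for the tutorial model
    def __init__(self, vocab_size: int, num_labels: int):
        super().__init__()
        self.linear = nn.Linear(vocab_size, num_labels)

    def forward(self, bow_vec):
        # bow_vec: [batch_size, vocab_size] -> scores: [batch_size, num_labels]
        # dim=1 normalizes over the labels of each sample.
        return F.log_softmax(self.linear(bow_vec), dim=1)


model = BoWClassifier(vocab_size=26, num_labels=10)   # placeholder sizes
log_probs = model(torch.randn(64, 26))                # a batch of 64 samples
print(log_probs.shape)                                # [64, 10]
print(log_probs.exp().sum(dim=1))                     # 64 values, each approximately 1
```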

I kept getting the following warning: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
  logpt = F.log_softmax(input)

Then I used dim=1

    #logpt = F.log_softmax(input)
    logpt = F.log_softmax(input, dim=1)

based on the reply “Implicit dimension choice for softmax warning - #10 by ruotianluo”,

but now I get this error:

train: True test: False
preparing datasets and dataloaders......
creating models......

=>Epoches 1, learning rate = 0.0010000, previous best = 0.0000
feats shape:  torch.Size([64, 419, 512])
labels shape:  torch.Size([64])
Traceback (most recent call last):
  File "", line 347, in <module>
    loss = criterion(m(output[:,1]-output[:,0]), labels.float())
  File "/home/jalal/research/venv/dpcc/lib/python3.8/site-packages/torch/nn/modules/", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "", line 87, in forward
    logpt = F.log_softmax(input, dim=1)
  File "/home/jalal/research/venv/dpcc/lib/python3.8/site-packages/torch/nn/", line 1769, in log_softmax
    ret = input.log_softmax(dim)
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

So how should I find out what dim should be?

As the new error message says, you are specifying dim=1 while input has only a single dimension.
Use dim=0 instead, and make sure that’s really the expected shape.
Common use cases have at least two dimensions, [batch_size, feature_dim], and apply log_softmax along the feature dimension. I’m not familiar with your use case, though, so an input with a single dimension might be alright.
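
A small sketch of what happens with a one-dimensional input; the size 64 is taken from the labels shape in the log, and the tensor contents are random placeholders.

```python
import torch
import torch.nn.functional as F

x = torch.randn(64)                 # a single dimension, as in the traceback
print(x.dim())                      # 1, so the only valid dims are 0 and -1

try:
    F.log_softmax(x, dim=1)
    failed = False
except IndexError as err:
    failed = True
    print(err)                      # the "Dimension out of range" message

ok = F.log_softmax(x, dim=0)        # runs, but normalizes across all 64 entries
print(ok.exp().sum())               # approximately 1
```

Note that dim=0 here normalizes across the whole vector, which for a batch of 64 samples means normalizing across samples. Whether that is intended depends on the use case.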

Thanks for your response. It is part of an older focal loss implementation:

        logpt = F.log_softmax(input)

Thanks for the link.
Based on both errors I don’t think you should fix the issues by changing the dim argument in log_softmax and gather, but should instead debug why the input has a single dimension (while more are expected).
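
One place to start looking is the expression in the traceback, output[:,1]-output[:,0]. The sketch below assumes output has shape [batch_size, 2], which is a guess and not confirmed by the log; it shows that indexing a column with an integer removes that dimension.

```python
import torch

output = torch.randn(64, 2)              # hypothetical [batch_size, 2] model output
diff = output[:, 1] - output[:, 0]       # integer indexing removes the second dimension

print(output.shape)                      # torch.Size([64, 2])
print(diff.shape)                        # torch.Size([64])

# Slicing with a range keeps the dimension, if a 2-D tensor is what the loss expects.
diff_2d = output[:, 1:2] - output[:, 0:1]
print(diff_2d.shape)                     # torch.Size([64, 1])
```

Keeping the dimension is not a fix by itself: a log_softmax over a dimension of size 1 always returns 0. Printing the shapes at each step at least shows where the input stops matching what the loss function was written for.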
