They are the same (see the implementation). I think the reason it isn’t working out for you is that log_softmax gives different results depending on the shape of its input: the shape of x when it’s passed into log_softmax in forward is different from the shape of logit2.
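As a minimal sketch of why shape matters (using a recent PyTorch where dim is explicit; the shapes here are made up for illustration):

import torch
import torch.nn.functional as F

x = torch.randn(2, 3)

# log_softmax normalizes along the given dim, so the same numbers laid out
# in different shapes produce different results:
a = F.log_softmax(x, dim=1)             # normalized over 3 values per row
b = F.log_softmax(x.view(1, 6), dim=1)  # normalized over all 6 values at once

# The entries of a and b come from the same numbers but differ, because the
# normalization sum runs over different sets of elements.
print(torch.allclose(a.view(1, 6), b))  # False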
The dim parameter is new and will be in the next release. The docs are fixed too. Here’s what it says in master, if you build from source:
In [5]: ?torch.nn.functional.log_softmax
Signature: torch.nn.functional.log_softmax(input, dim=None, _stacklevel=3)
Docstring:
Applies a softmax followed by a logarithm.
While mathematically equivalent to log(softmax(x)), doing these two
operations separately is slower, and numerically unstable. This function
uses an alternative formulation to compute the output and gradient correctly.
See :class:`~torch.nn.LogSoftmax` for more details.
Arguments:
    input (Variable): input
    dim (int): A dimension along which log_softmax will be computed.
File: /data/users/sgross/pytorch/torch/nn/functional.py
Type: function
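Once dim is available (e.g. on a build from master), a sketch of the intended usage, with made-up shapes:

import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)               # batch of 4, 10 classes
log_probs = F.log_softmax(logits, dim=1)  # normalize over the class dimension

# Each row sums to 1 in probability space:
print(log_probs.exp().sum(dim=1))         # ~tensor([1., 1., 1., 1.])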