Hi,

What are criteria for choosing “dim=0 or 1” for nn.Softmax and nn.LogSoftmax

Usually you would like to normalize the probabilities (or log probabilities) in the feature dimension (dim=1) and treat the samples in the batch independently (dim=0).

If you apply `F.softmax(logits, dim=1)`, the probabilities for each sample will sum to 1:

```
import torch
import torch.nn.functional as F

# 4 samples, 2 output classes
logits = torch.randn(4, 2)
print(F.softmax(logits, dim=1))
> tensor([[0.7869, 0.2131],
          [0.4869, 0.5131],
          [0.2928, 0.7072],
          [0.2506, 0.7494]])
```
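To make the difference concrete, here is a small sketch (my own, not from the thread above): with `dim=1` each row (sample) sums to 1, while with `dim=0` each column (class) sums to 1 across the batch.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 2)  # 4 samples, 2 classes

row_probs = F.softmax(logits, dim=1)  # normalize over classes, per sample
col_probs = F.softmax(logits, dim=0)  # normalize over samples, per class

print(row_probs.sum(dim=1))  # each of the 4 rows sums to 1
print(col_probs.sum(dim=0))  # each of the 2 columns sums to 1
```

For a classification output of shape `[batch_size, num_classes]`, only the `dim=1` version gives a valid probability distribution per sample.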


@ptrblck I found that the PyTorch official example uses dim=0 for multiclass classification. Why does it use dim=0 here?

https://pytorch.org/tutorials/intermediate/tensorboard_tutorial.html

The function is as below:

```
def images_to_probs(net, images):
    '''
    Generates predictions and corresponding probabilities from a trained
    network and a list of images
    '''
    output = net(images)
    # convert output probabilities to predicted class
    _, preds_tensor = torch.max(output, 1)
    preds = np.squeeze(preds_tensor.numpy())
    return preds, [F.softmax(el, dim=0)[i].item() for i, el in zip(preds, output)]
```

The posted code snippet iterates `output`, so `el` won't have the batch dimension anymore and `dim=0` will thus use the "class dimension".
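As a quick check (my own sketch, not part of the tutorial): iterating a 2-D tensor yields its rows as 1-D tensors, so applying softmax with `dim=0` to each row `el` is equivalent to applying it with `dim=1` to the whole batched output.

```python
import torch
import torch.nn.functional as F

output = torch.randn(4, 2)  # simulated network output: batch of 4, 2 classes

# per-row softmax: each el is a 1-D tensor, so dim=0 is its class dimension
per_row = torch.stack([F.softmax(el, dim=0) for el in output])

# batched softmax over the class dimension
batched = F.softmax(output, dim=1)

print(torch.allclose(per_row, batched))  # True
```

So both versions compute the same probabilities; the tutorial's `dim=0` is correct only because the batch dimension has already been removed by the iteration.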
