How to choose "dim=0/1" for softmax or logsoftmax


What are the criteria for choosing “dim=0 or 1” for nn.Softmax and nn.LogSoftmax?

Usually you would like to normalize the probabilities (or log probabilities) in the feature dimension (dim=1) and treat the samples in the batch independently (dim=0).
If you apply F.softmax(logits, dim=1), the probabilities for each sample will sum to 1:

import torch
import torch.nn.functional as F

# 4 samples, 2 output classes
logits = torch.randn(4, 2)
print(F.softmax(logits, dim=1))
> tensor([[0.7869, 0.2131],
        [0.4869, 0.5131],
        [0.2928, 0.7072],
        [0.2506, 0.7494]])
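As a quick sanity check (a minimal sketch using the same hypothetical 4×2 logits shape), summing along dim=1 gives 1 for every sample, whereas dim=0 would instead normalize each class column across the batch:

```python
import torch
import torch.nn.functional as F

# hypothetical logits: 4 samples, 2 classes
logits = torch.randn(4, 2)

# dim=1: each row (sample) sums to 1
row_sums = F.softmax(logits, dim=1).sum(dim=1)
print(row_sums)  # ≈ tensor([1., 1., 1., 1.])

# dim=0: each column (class) sums to 1 across the batch
# (usually not what you want for per-sample class probabilities)
col_sums = F.softmax(logits, dim=0).sum(dim=0)
print(col_sums)  # ≈ tensor([1., 1.])
```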

@ptrblck I found that the official PyTorch example uses dim=0 for multiclass classification. Why does it use dim=0 here?
The function is as below:

helper functions

def images_to_probs(net, images):
    '''
    Generates predictions and corresponding probabilities from a trained
    network and a list of images
    '''
    output = net(images)
    # convert output probabilities to predicted class
    _, preds_tensor = torch.max(output, 1)
    preds = np.squeeze(preds_tensor.numpy())
    return preds, [F.softmax(el, dim=0)[i].item() for i, el in zip(preds, output)]

The posted code snippet iterates over output, so el won’t have the batch dimension anymore, and dim=0 will thus refer to the “class dimension”.
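To make that concrete (a minimal sketch with a hypothetical 4×2 output tensor standing in for net(images)): iterating over the batched output yields 1-D tensors of shape [num_classes], so dim=0 is the only remaining dimension, i.e. the class dimension:

```python
import torch
import torch.nn.functional as F

# hypothetical network output: 4 samples, 2 classes
output = torch.randn(4, 2)

for el in output:
    # el has shape [2]: the batch dimension is gone after iteration
    print(el.shape)  # torch.Size([2])
    probs = F.softmax(el, dim=0)  # dim=0 now indexes the classes
    print(probs.sum())  # ≈ tensor(1.)
```

So F.softmax(el, dim=0) inside the loop is equivalent to F.softmax(output, dim=1) applied to the full batch.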
