Softmax gives output vector whose sum is greater than 1

Hi,
I am new to PyTorch. I was trying out the following network architecture to train a multi-class classifier. I used Softmax at the output layer and cross entropy as the loss function. However, the output doesn't look like probabilities. For example, one of the outputs looks like this: [2.0032e-10, 1.798e-8, …, 1.0000e+0, …, 2.112e-4]. My question is: how can one of the values be 1 when they all have to sum to 1?

import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # 13 input features, 40 output classes
        self.fc1 = nn.Linear(13, 512)
        self.fc2 = nn.Linear(512, 512)
        self.fc3 = nn.Linear(512, 512)
        self.fc4 = nn.Linear(512, 512)
        self.fc5 = nn.Linear(512, 40)
        self.bn1 = nn.BatchNorm1d(512)
        self.bn2 = nn.BatchNorm1d(512)
        self.bn3 = nn.BatchNorm1d(512)
        self.bn4 = nn.BatchNorm1d(512)

    def forward(self, x):
        x = self.bn1(F.relu(self.fc1(x)))
        x = self.bn2(F.relu(self.fc2(x)))
        x = self.bn3(F.relu(self.fc3(x)))
        x = self.bn4(F.relu(self.fc4(x)))
        # softmax over the class dimension
        x = F.softmax(self.fc5(x), dim=1)
        return x

Please correct me if I am wrong and help me out.

You are most likely seeing floating point precision issues: the printed values are rounded, so an entry displayed as 1.0000e+0 is actually slightly below 1, and the whole vector still sums to 1.
That being said, note that nn.CrossEntropyLoss expects logits, since F.log_softmax and nn.NLLLoss are applied internally, so you should remove the softmax when using this criterion.
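
As a quick sanity check, here is a minimal sketch with made-up logits that illustrates both points (the printed 1.0000e+00 is only a rounded display, and CrossEntropyLoss on logits matches NLLLoss on log_softmax):

import torch
import torch.nn.functional as F

# made-up logits, just for illustration: one class dominates strongly
logits = torch.tensor([[-12.0, 25.0, -3.0, 1.5]])
probs = F.softmax(logits, dim=1)
print(probs)        # something like [[8.5e-17, 1.0000e+00, 6.9e-13, 6.2e-11]]
print(probs.sum())  # ~1.0; the 1.0000e+00 above is just rounded for display

# nn.CrossEntropyLoss on raw logits == nn.NLLLoss on log_softmax of the logits
target = torch.tensor([1])
loss_ce = torch.nn.CrossEntropyLoss()(logits, target)
loss_nll = torch.nn.NLLLoss()(F.log_softmax(logits, dim=1), target)
print(loss_ce.item(), loss_nll.item())  # the two values match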

If the softmax is removed, the output is no longer between 0 and 1; it contains negative values as well, like [-12.098, 2.0988, -12.121, …, 0.87, 0.21]. But I need probabilities for each of the classes.

nn.CrossEntropyLoss expects these logits.
For debugging purposes you can still apply softmax to the output to inspect the probabilities; just don't pass the softmax output to the criterion.
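
Something like this minimal sketch shows the idea; the tiny linear model and the random inputs/targets are just stand-ins for illustration:

import torch
import torch.nn as nn
import torch.nn.functional as F

# tiny stand-in for the network above, with no softmax in forward()
model = nn.Linear(13, 40)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(8, 13)            # random batch of 8 samples, 13 features
targets = torch.randint(0, 40, (8,))   # random class labels, just for illustration

optimizer.zero_grad()
logits = model(inputs)                  # raw logits go straight to the criterion
loss = criterion(logits, targets)
loss.backward()
optimizer.step()

# only for debugging / inspection: convert the logits to probabilities
probs = F.softmax(logits.detach(), dim=1)
print(probs.sum(dim=1))                 # each row sums to 1 (up to float precision)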

So how do I get the class probabilities? I need the output as probabilities. How can I achieve that, with or without Softmax?

Solved. I got the output as probabilities by passing the predictions through softmax, but didn't include softmax in the network architecture, since nn.CrossEntropyLoss already applies log-softmax internally. Thank you.
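
For anyone landing here later, this is roughly what inference looks like once the softmax line is removed from forward() in the Net above (the random input batch is only for illustration):

import torch
import torch.nn.functional as F

# assumes Net is the class from the first post, with the softmax removed
# so that forward() ends with `return self.fc5(x)`
model = Net()
model.eval()                          # eval mode so BatchNorm uses running stats

x = torch.randn(4, 13)                # random batch, 13 features per sample
with torch.no_grad():
    logits = model(x)                 # raw logits from the network
    probs = F.softmax(logits, dim=1)  # probabilities, each row sums to ~1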