Your custom function returns the same output as F.softmax, up to floating-point rounding error:
import torch
import torch.nn.functional as F

x = torch.randn(5, 10)
output = F.softmax(x, 1)
maxes = torch.max(x, 1, keepdim=True)[0]
x_exp = torch.exp(x-maxes)
x_exp_sum = torch.sum(x_exp, 1, keepdim=True)
output_custom = x_exp/x_exp_sum
print(torch.allclose(output, output_custom))
> True
print(torch.sum(torch.abs(output-output_custom)))
> tensor(2.3108e-7)
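
The max subtraction in the snippet above does not change the result mathematically; it only keeps torch.exp from overflowing. A minimal sketch of why that matters, assuming float32 inputs shifted to around 1000 (the helper names stable_softmax and naive_softmax are made up for this illustration, not part of PyTorch):

```python
import torch
import torch.nn.functional as F


def stable_softmax(x: torch.Tensor, dim: int = 1) -> torch.Tensor:
    # Same steps as the snippet above: shift by the max along `dim`
    # so the largest exponent is exp(0) = 1.
    maxes = torch.max(x, dim, keepdim=True)[0]
    x_exp = torch.exp(x - maxes)
    return x_exp / torch.sum(x_exp, dim, keepdim=True)


def naive_softmax(x: torch.Tensor, dim: int = 1) -> torch.Tensor:
    # Hypothetical variant without the shift, for comparison only.
    x_exp = torch.exp(x)
    return x_exp / torch.sum(x_exp, dim, keepdim=True)


# Assumed test input: values near 1000, where exp() overflows in float32.
x_large = torch.randn(5, 10) + 1000.0

# The naive version produces inf / inf = nan.
print(torch.isnan(naive_softmax(x_large)).any())

# The shifted version still agrees with F.softmax.
print(torch.allclose(stable_softmax(x_large), F.softmax(x_large, 1)))
```

With inputs of ordinary magnitude, such as torch.randn(5, 10), both versions should agree with F.softmax to within rounding error; the difference only shows up once the exponentials overflow.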