I am not getting the same output from my numpy log-softmax as from PyTorch’s log_softmax when applied to the same class scores. Here’s an example:
import numpy as np
import torch
import torch.nn.functional as F

output = torch.tensor([[ 0.0967, -0.1222, -0.0911, -0.1141,  0.0384,  0.0188, -0.0117,  0.1603,  0.1052,  0.0878],
                       [ 0.0851, -0.0978, -0.1053, -0.0418,  0.0653,  0.0433, -0.0142,  0.1614,  0.1607,  0.0027]])
y_true = [4, 6]
A numpy log-softmax:
# numerically stable softmax
f = output.data.numpy()
f = f - np.max(f, axis=1)[:, np.newaxis]
y_hat_softmax = np.exp(f) / np.sum(np.exp(f), axis=1)[:, np.newaxis]
# now take the log
y_hat_log_softmax = np.log10(y_hat_softmax)
y_hat_log_softmax
This prints the following:
array([[-0.96723109, -1.06230795, -1.04878807, -1.05878055, -0.99254912,
-1.00104535, -1.01430881, -0.93959135, -0.96352923, -0.97110444],
[-0.97608072, -1.05549479, -1.05876291, -1.03119493, -0.98465806,
-0.99422187, -1.01919532, -0.94294888, -0.94322318, -1.01187277]], dtype=float32)
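As a quick sanity check on the softmax step itself (before the log), each row should sum to 1, and it does:
print(y_hat_softmax.sum(axis=1))  # both rows sum to ~1.0, so the softmax looks fine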
Now using PyTorch’s log_softmax:
y_hat_soft = F.log_softmax(output, dim=1)
y_hat_soft.data.numpy()
which prints:
array([[-2.22713184, -2.44605446, -2.41492367, -2.43793225, -2.285429 ,
-2.3049922 , -2.33553243, -2.16348886, -2.21860814, -2.23605061],
[-2.247509 , -2.43036652, -2.43789172, -2.37441397, -2.26725888,
-2.28928041, -2.34678411, -2.17122006, -2.17185163, -2.32992315]], dtype=float32)
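One more data point, in case it helps: comparing the two results elementwise (using y_hat_log_softmax and y_hat_soft from above), the ratio between corresponding entries appears to be constant:
ratio = y_hat_soft.data.numpy() / y_hat_log_softmax
print(ratio.min(), ratio.max())  # both ~2.3026, i.e. the two outputs differ by a constant factor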
Maybe I’m doing something really stupid, but I’d appreciate any pointers!