I was checking the C code for LogSoftmax, then I came to this line,
Now, LogSoftmax can be expressed as x_i - log( exp(x).sum() ).
But what is the significance of adding max_input to log( exp(x).sum() )?
taha (Taha)
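Subtracting max_input from the argument of the exp function prevents numerical overflow when the inputs are large. Since log( exp(x).sum() ) = max_input + log( exp(x - max_input).sum() ), max_input is added back afterwards so the result is mathematically unchanged. See LogSumExp on Wikipedia.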
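
To make the effect concrete, here is a minimal sketch in plain Python (not the actual C source). The function names log_softmax_naive and log_softmax_stable are made up for illustration; max_input just mirrors the variable name discussed above.

```python
import math


def log_softmax_naive(x):
    # Direct translation of x_i - log(exp(x).sum()).
    # math.exp raises OverflowError once an input is too large (around 710).
    log_sum = math.log(sum(math.exp(v) for v in x))
    return [v - log_sum for v in x]


def log_softmax_stable(x):
    # Shift every input by the maximum so the largest exponent is exp(0) = 1.
    max_input = max(x)
    shifted_sum = sum(math.exp(v - max_input) for v in x)
    # Add max_input back so the value equals log(exp(x).sum()) mathematically.
    log_sum = max_input + math.log(shifted_sum)
    return [v - log_sum for v in x]


small = [1.0, 2.0, 3.0]
large = [1000.0, 1001.0, 1002.0]

print(log_softmax_stable(small))
print(log_softmax_stable(large))

try:
    log_softmax_naive(large)
except OverflowError as err:
    print("naive version failed:", err)
```

Both inputs differ only by a constant shift, so the stable version should give the same log-probabilities for each, while the naive version cannot even evaluate exp(1000).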