More stable softmax with temperature

Abhilash_Srivastava · January 6, 2021, 11:36am

If I understand correctly:
torch.pow(torch.exp(seq_nll), 0.005) is mathematically the same as torch.exp(seq_nll * 0.005), since (e^x)^t = e^(x*t).

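So you can pass seq_nll * 0.005 directly to torch.nn.functional.softmax instead of exponentiating and normalizing by hand. That also avoids computing torch.exp(seq_nll) on its own, which can overflow for large values.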
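
Here is a rough sketch of the comparison, in case it helps. The tensor values and dim=0 are made up for illustration (I'm assuming seq_nll is a 1-D float32 tensor and that you normalize over that one dimension); only seq_nll and the 0.005 exponent come from the thread.

```python
import torch
import torch.nn.functional as F

# Hypothetical values; small enough that exp() does not overflow in float32.
seq_nll = torch.tensor([3.0, 1.5, 4.2, 2.0])
temperature = 0.005  # the exponent used in the thread

# Original form: exponentiate, raise to the power, then normalize by hand.
weights = torch.pow(torch.exp(seq_nll), temperature)
naive = weights / weights.sum()

# Equivalent form: scale the input first, then let softmax normalize.
stable = F.softmax(seq_nll * temperature, dim=0)

# The two results should agree up to floating-point rounding.
same = torch.allclose(naive, stable)

# With large inputs, exp() overflows to inf in float32, so the hand-rolled
# version ends up dividing inf by inf, while the softmax version stays finite.
large = torch.tensor([1000.0, 900.0, 1100.0])
large_weights = torch.pow(torch.exp(large), temperature)
naive_large = large_weights / large_weights.sum()
stable_large = F.softmax(large * temperature, dim=0)
```

For the small inputs both forms should match; for the large ones, naive_large is where the instability shows up, and stable_large is what you would actually want to use.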