`dst_out` is the output.
After I got the NaN loss, I looked through the intermediate states and found negative values in the softmax output.
Could anyone explain this?
There is no negative output after applying `F.softmax`, since `Y[Y<0]` returns an empty tensor. `dst_out` would be the logits, which are not bounded to a specific range.
Thanks. But `X` is inconsistent with `Y`.
I don’t quite understand it. Could you explain what exactly is inconsistent?
Note that the softmax function will map values from `[-inf, inf]` to `[0, 1]`.
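For example (a small sketch with made-up logits), even extreme values end up in that range:

```python
import torch

# Even extreme logits land in [0, 1] after softmax.
logits = torch.tensor([[-1000.0, 0.0, 1000.0]])
probs = torch.softmax(logits, dim=1)
print(probs)
> tensor([[0., 0., 1.]])
print(((probs >= 0) & (probs <= 1)).all())
> tensor(True)
```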
When I run `X = torch.softmax(dst_out, dim=1)`, `X` contains negative values, as mentioned above.
Ah, thanks for the pointer. I didn’t see the `torch.softmax` call in the first line.
Which PyTorch version are you using? I cannot reproduce this issue with the latest nightly binary:

```python
import torch
import torch.nn.functional as F

x = torch.randn(10, 10)
out1 = torch.softmax(x, dim=1)
out2 = F.softmax(x, dim=1)
print((out1 == out2).all())
> tensor(True)
print(out1[out1 < 0], out2[out2 < 0])
> tensor([]) tensor([])
```
That’s really strange indeed. Could you try to add `assert` statements to your original code and check for negative outputs again?
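Something along these lines (a sketch; `dst_out` stands for your model output):

```python
X = torch.softmax(dst_out, dim=1)
# Fail loudly as soon as a negative probability shows up.
assert (X >= 0).all(), f"negative softmax outputs: {X[X < 0]}"
```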
PS: you can post code snippets by wrapping them into three backticks ```, which makes debugging easier.
The torch version is 1.5.0. After adding the `assert` statement, the assertion is triggered.
By the way, it works when I change `torch.softmax(…)` to `torch.log_softmax(…).exp()`.
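i.e. something like this (a sketch, with `dst_out` again as the logits):

```python
# Workaround: log_softmax followed by exp stays non-negative.
X = torch.log_softmax(dst_out, dim=1).exp()
assert (X >= 0).all()
```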
Could you update to 1.6 and rerun the code?