Why does the PyTorch loss function give different results than TF?

I’m trying to figure out why PyTorch’s binary_cross_entropy gives different results than TF’s binary_crossentropy.


This is not a PyTorch-specific issue. However, if you use from_logits=True in TensorFlow’s binary_crossentropy function, you will get the same value as PyTorch’s function.
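To see why from_logits matters, here is a minimal pure-Python sketch (no TF or PyTorch needed; the helper names bce_from_prob and bce_from_logit are made up for illustration). The probability form corresponds to loss functions that expect a sigmoid-activated output; the logits form corresponds to from_logits=True (or PyTorch’s binary_cross_entropy_with_logits), which applies the sigmoid internally in a numerically stable way. Both give the same value when fed the matching input:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce_from_prob(p, y):
    # Expects p already in (0, 1), i.e. a sigmoid output.
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def bce_from_logit(z, y):
    # Numerically stable logits formulation; the sigmoid is folded
    # into the loss instead of being applied beforehand.
    return max(z, 0) - z * y + math.log(1 + math.exp(-abs(z)))

z, y = 1.3, 1.0
loss_prob = bce_from_prob(sigmoid(z), y)  # probability path
loss_logit = bce_from_logit(z, y)         # logits path
print(loss_prob, loss_logit)              # same value either way
```

So the two libraries agree once both are told the same thing about whether the input is a raw logit or an already-activated probability.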

If you check the PyTorch docs for binary cross entropy, you can see in the example that a sigmoid is applied to the output before it is passed into the loss function:

>>> loss = F.binary_cross_entropy(F.sigmoid(input), target)

The prediction I provided has already been passed through nn.Sigmoid.

I’ll need to read about from_logits. Thank you.
