Hi,
When you use the softmax function, the model's outputs are turned into class probabilities, so you're predicting a class for a given input (in your case, the model predicts the first of the 2 available classes).
In the second case, you're outputting the raw logits. If you fit your model on those directly, you're no longer predicting classes from a probability distribution; you're just fitting the raw output values, which isn't the same thing.
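To make the difference concrete, here is a minimal sketch in plain Python (no framework, so it is not your actual model). The logit values and the helper name `softmax` are made up for illustration; the point is only to show what softmax does to a pair of raw scores.

```python
import math


def softmax(logits):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs of a 2-class model for a single input.
logits = [2.0, -1.0]

# With softmax: values in [0, 1] that sum to 1, readable as probabilities.
probs = softmax(logits)

# The predicted class is the index of the largest probability.
predicted_class = probs.index(max(probs))

print("logits:", logits)
print("probabilities:", probs)
print("predicted class index:", predicted_class)
```

Softmax preserves the ordering of the scores, so the largest logit and the largest probability point at the same class. What changes is the meaning of the numbers: logits are unbounded scores, while the softmax outputs form a probability distribution over the classes.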
If you want to read more, there's a nice thread on this here: Logits vs. log-softmax - #2 by KFrank