Neural network binary classification: softmax, log_softmax, and loss function

I am building a binary classifier where the class I want to predict is present in fewer than 2% of samples.

The last layer could be log_softmax or softmax:

self.softmax = nn.Softmax(dim=1) or self.softmax = nn.LogSoftmax(dim=1)

My questions:

  1. I should use softmax, as it will provide outputs that sum to 1, so I can check performance at various probability thresholds. Is that understanding correct?

  2. If I use softmax, can I use cross_entropy loss? This seems to suggest that it is okay to do so.

  3. If I use log_softmax, can I use cross_entropy loss? This seems to suggest that I shouldn't.

  4. If I use softmax, is there any better option than cross_entropy loss?

        `cross_entropy = nn.CrossEntropyLoss(weight=class_wts)`
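For what it's worth on questions 2 and 3: in PyTorch, `nn.CrossEntropyLoss` expects raw logits and applies log_softmax internally, so applying softmax or log_softmax yourself before it double-counts the normalization. A quick check with made-up tensors (the shapes here are illustrative):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(4, 2)            # raw scores: 4 samples, 2 classes
targets = torch.tensor([0, 1, 1, 0])  # class indices

# CrossEntropyLoss takes the raw logits directly.
ce = nn.CrossEntropyLoss()(logits, targets)

# It is equivalent to log_softmax followed by NLLLoss.
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), targets)

print(torch.allclose(ce, nll))  # the two losses match
```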
    

Hi Ni!

Build a model that outputs a single value (per sample in a batch),
typically by using a Linear with out_features = 1 as the final
layer.

This value will be a raw-score logit. Use BCEWithLogitsLoss as your
loss criterion (and do not use a final “activation” such as sigmoid() or
softmax() or log_softmax()).
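A minimal sketch of this setup (the layer sizes and batch shape are assumptions, not from the thread):

```python
import torch
import torch.nn as nn

# Final Linear has out_features=1, so each sample yields one raw-score logit.
# Note: no sigmoid/softmax/log_softmax at the end of the model.
model = nn.Sequential(
    nn.Linear(10, 32),   # 10 input features is a placeholder
    nn.ReLU(),
    nn.Linear(32, 1),    # single logit per sample
)

criterion = nn.BCEWithLogitsLoss()

x = torch.randn(8, 10)                   # batch of 8 samples
y = torch.randint(0, 2, (8, 1)).float()  # BCE targets must be float
loss = criterion(model(x), y)            # loss applies sigmoid internally
loss.backward()
```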

Either sample your underrepresented class more heavily when training,
e.g., about fifty times more heavily, or weight the underrepresented class
in your loss computation by using BCEWithLogitsLoss's pos_weight
constructor argument with something like:

`criterion = torch.nn.BCEWithLogitsLoss(pos_weight=torch.tensor([50.0]))`
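The first option (sampling the rare class more heavily) can be sketched with `WeightedRandomSampler`; the dataset and the 50x factor below are illustrative assumptions:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Hypothetical imbalanced dataset: 2% positives.
targets = torch.zeros(1000)
targets[:20] = 1.0
data = torch.randn(1000, 10)

# Give each positive sample ~50x the sampling weight of a negative one.
weights = torch.ones(1000)
weights[targets == 1] = 50.0
sampler = WeightedRandomSampler(weights, num_samples=len(targets), replacement=True)

loader = DataLoader(TensorDataset(data, targets), batch_size=32, sampler=sampler)

# Positives now appear far more often than 2% of each batch.
batch_x, batch_y = next(iter(loader))
```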

Best.

K. Frank


Could you answer my 4 questions? Just yes or no would suffice…
I will also look into your reply and try it.

A few additional questions:
I understand your suggestion to "not use a final 'activation' such as sigmoid() or softmax() or log_softmax()". But what should my final activation be? I looked at Linear, and it doesn't do anything; it is just a pass-through. Could you point me to the exact function?