I am building a binary classifier where the class I want to predict is present less than 2% of the time.
The last layer could be LogSoftmax or Softmax.
self.softmax = nn.Softmax(dim=1) or self.softmax = nn.LogSoftmax(dim=1)
My questions:
1. I should use Softmax, as it produces outputs that sum to 1, so I can evaluate performance at various probability thresholds. Is that understanding correct?
2. If I use Softmax, can I use cross_entropy loss? This seems to suggest that it is okay.
3. If I use LogSoftmax, can I use cross_entropy loss? This seems to suggest that I shouldn't.
4. If I use Softmax, is there any better option than cross_entropy loss?
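For reference, a minimal sketch of the two-output setup I am describing (the input size 10 and hidden size 32 are made-up placeholders):

```python
import torch.nn as nn

# Sketch only: input size (10) and hidden size (32) are placeholders.
class TwoOutputNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 32)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(32, 2)       # two classes
        self.softmax = nn.Softmax(dim=1)  # or nn.LogSoftmax(dim=1)

    def forward(self, x):
        return self.softmax(self.fc2(self.relu(self.fc1(x))))
```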
Build a model that outputs a single value (per sample in a batch), typically by using a Linear with out_features = 1 as the final layer.

This value will be a raw-score logit. Use BCEWithLogitsLoss as your loss criterion (and do not use a final “activation” such as sigmoid() or softmax() or log_softmax()).
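A minimal sketch of this recipe (the input size, hidden size, and batch are placeholders):

```python
import torch
import torch.nn as nn

# Sketch: input size (10) and hidden size (32) are placeholders.
model = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Linear(32, 1),  # out_features = 1 -> one raw-score logit per sample
)

criterion = nn.BCEWithLogitsLoss()  # applies sigmoid internally

x = torch.randn(8, 10)                        # dummy batch of 8 samples
target = torch.randint(0, 2, (8, 1)).float()  # 0/1 labels, float, same shape as logits
loss = criterion(model(x), target)
```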
Either sample your underrepresented class more heavily when training, e.g., about fifty times more heavily, or weight the underrepresented class in your loss computation by using BCEWithLogitsLoss’s pos_weight constructor argument with something like:
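(A sketch of what such a pos_weight might look like; the value 50.0 is an assumption matching the roughly 2% positive rate mentioned above.)

```python
import torch
import torch.nn as nn

# pos_weight ~ (# negatives / # positives); ~2% positives -> roughly 50
criterion = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([50.0]))
```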
Could you answer my 4 questions? Just yes or no would suffice…
I will also look into your reply and try it.
A few additional questions:
I understand your suggestion “and do not use a final ‘activation’ such as sigmoid() or softmax() or log_softmax()”. But what should my final activation be? I looked at Linear, and it doesn't do anything; it is just a pass-through. Could you point to the exact function?
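If it helps, a sketch of what I mean: is the final “activation” effectively just a pass-through, e.g. nn.Identity()?

```python
import torch.nn as nn

# Sketch: is the "final activation" just the raw Linear output,
# i.e. effectively a pass-through like nn.Identity()?
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(32, 1)   # final layer: one raw-score logit
        self.final = nn.Identity()   # pass-through; leaves the logit unchanged

    def forward(self, x):
        return self.final(self.fc(x))
```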