Timeseries classification (using LSTM-CNN) training loss not decreasing even after increasing model size

If you want to use the probabilities before applying a threshold to create class predictions, you can still use torch.sigmoid (just don’t pass its output to nn.BCEWithLogitsLoss, which expects raw logits).
You could also keep the raw logits and map the probability threshold to an equivalent logit threshold, as explained in this post by @KFrank.
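
Here is a minimal sketch of both options. The logits, targets, and the 0.7 threshold are made-up values for illustration only (in your code the logits would come from your model), and the logit threshold uses the inverse of the sigmoid, log(p / (1 - p)):

```python
import math

import torch
import torch.nn as nn

# Hypothetical raw model outputs (logits) and binary targets, shape [batch, 1].
logits = torch.tensor([[-2.0], [-0.5], [0.0], [0.5], [1.0], [3.0]])
targets = torch.tensor([[0.0], [0.0], [1.0], [0.0], [1.0], [1.0]])

# The loss gets the raw logits; nn.BCEWithLogitsLoss applies the sigmoid internally.
criterion = nn.BCEWithLogitsLoss()
loss = criterion(logits, targets)

# Option 1: compute probabilities only for the predictions, then threshold them.
threshold = 0.7  # assumed probability threshold, pick whatever fits your use case
probs = torch.sigmoid(logits)
preds_from_probs = (probs > threshold).long()

# Option 2: skip the sigmoid and threshold the logits directly.
# The sigmoid is monotonic, so p > threshold is equivalent to
# logit > log(threshold / (1 - threshold)).
logit_threshold = math.log(threshold / (1.0 - threshold))
preds_from_logits = (logits > logit_threshold).long()

print(loss.item())
print(preds_from_probs.flatten().tolist())
print(preds_from_logits.flatten().tolist())
```

Both options should give the same class predictions; for a threshold of 0.5 the logit threshold is simply 0.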