LSTM training loss does not decrease

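One problem could be your loss function: nn.BCEWithLogitsLoss expects raw logits as input, not sigmoid outputs. It applies the sigmoid internally, so if your model already ends in a sigmoid, the activation is applied twice. Either remove the final sigmoid and keep nn.BCEWithLogitsLoss, or keep the sigmoid and switch to nn.BCELoss.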
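Here is a minimal sketch of the difference. The tensors are made-up values for illustration, not taken from your model, and the variable names are hypothetical:

```python
import torch
import torch.nn as nn

# Hypothetical raw model outputs (logits) and binary targets, for illustration only.
logits = torch.tensor([[4.0], [-3.0], [2.5], [-5.0]])
targets = torch.tensor([[1.0], [0.0], [1.0], [0.0]])

criterion = nn.BCEWithLogitsLoss()

# Correct: pass raw logits; the loss applies the sigmoid internally.
loss_correct = criterion(logits, targets)

# Bug: sigmoid applied before the loss, so it is effectively applied twice.
loss_double = criterion(torch.sigmoid(logits), targets)

# Equivalent alternative: keep the sigmoid in the model and use nn.BCELoss.
loss_bce = nn.BCELoss()(torch.sigmoid(logits), targets)

print(f"logits -> BCEWithLogitsLoss:  {loss_correct.item():.4f}")
print(f"sigmoid -> BCEWithLogitsLoss: {loss_double.item():.4f}")
print(f"sigmoid -> BCELoss:           {loss_bce.item():.4f}")
```

With the double sigmoid, the values reaching the loss are confined to (0, 1), so the per-sample loss cannot fall below about 0.31 for positive targets or about 0.69 for negative ones, even when the predictions are confident and correct. That can look like a loss that stops decreasing.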