Binary classification with unbalanced data - help with the last layer and the loss

I have an NLP binary classification problem with class imbalance: class 2 is present in less than 2% of the samples.
I am using GloVe embeddings to convert the text to numbers.

In my earlier questions I was advised not to use softmax, log-softmax, or sigmoid as the last layer, and also not to use CrossEntropyLoss.

I am not sure what my last layer and my loss function should be. I would appreciate suggestions.

My current code is below; the last few layers of my network are:

        self.batch_norm2 = nn.BatchNorm1d(num_filters)
        self.fc2 = nn.Linear(np.sum(num_filters), fc2_neurons)
        self.batch_norm3 = nn.BatchNorm1d(fc2_neurons)
        self.fc3 = nn.Linear(fc2_neurons, 2)
        self.softmax = nn.Softmax(dim=1)

Question 1) Should I replace the last two lines above with the ones below? Let me know if there are any other choices.
I am using sigmoid after the linear layer because it gives values between 0 and 1, so I could apply different probability cutoffs if required.

    self.fc3 = nn.Linear(fc2_neurons, 1)
    self.sigmoid = nn.Sigmoid()
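
For context, this is roughly how I would use that sigmoid output with an adjustable cutoff at prediction time (just a sketch; `model`, `x_batch`, and the 0.5 cutoff are placeholders):

    import torch

    # model and x_batch are placeholders for my trained network and an input batch
    cutoff = 0.5  # could be lowered for the rare class if required
    model.eval()
    with torch.no_grad():
        probs = model(x_batch)                        # sigmoid outputs in [0, 1], shape (batch, 1)
        preds = (probs.squeeze(1) >= cutoff).long()   # 1 = the rare class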

My current loss is below:

    cross_entropy = nn.CrossEntropyLoss(weight=class_wts)
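
(For reference, class_wts holds per-class weights; one common way to compute them is inverse class frequency, as in this sketch where `train_labels` stands in for my 0/1 label tensor:)

    import torch

    # train_labels is a placeholder for the training label tensor (0 = majority, 1 = minority)
    counts = torch.bincount(train_labels, minlength=2).float()  # samples per class
    class_wts = counts.sum() / (2.0 * counts)                    # minority class gets the larger weight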

Question 2) And should the loss be the line shown below? Let me know if there are any other choices.

    # class_wts holds the class weights, so class_wts[1] / class_wts[0] is the minority-to-majority weight ratio
    BCE_loss = nn.BCELoss(pos_weight=torch.tensor(class_wts[1] / class_wts[0]))
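
One thing I noticed while writing this: pos_weight seems to be an argument of nn.BCEWithLogitsLoss (which applies the sigmoid internally) rather than nn.BCELoss, so the weighted version would probably look more like this sketch (here `logits` and `targets` are placeholders for the raw fc3 output and the 0/1 labels):

    import torch
    import torch.nn as nn

    # pos_weight is the weight of the positive (rare) class relative to the negative class
    pos_weight = torch.tensor([class_wts[1] / class_wts[0]])
    bce_logits_loss = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

    # expects raw logits (no sigmoid in the model) and float targets of the same shape
    # loss = bce_logits_loss(logits, targets.float().unsqueeze(1))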