RNNs and imbalanced data

Hi all, I have signal data on which I’m doing binary classification. I trained a ResNet-50 with 1D convs and achieved 85% accuracy. To try to improve on that, I’m currently training a bidirectional RNN with LSTM cells. I’ve heard a couple of times that RNNs perform really well on sequential data, but I was only able to reach 75% with the RNN model. I tried different settings for hidden layers and num_layers in the LSTM, but nothing seems to have worked. Any suggestions that could help me? The data is imbalanced with an 80-20 distribution. Accuracy is not my primary metric. AUC was 81 for the RNN and 93 for the ResNet.
Thanks in advance.

Standard accuracy scores mean very little with a highly unbalanced dataset.

If you have an 80:20 distribution, you can get 80% accuracy with standard scoring methods by just writing a few lines of code that always return the majority (80%) class.

What you want instead is an accuracy method that reports class-specific accuracy, or the mean of the per-class accuracies.
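
As a rough illustration of that idea, here is a minimal sketch in plain Python. The function names per_class_accuracy and mean_class_accuracy are my own, made up for this example, and the toy labels just mimic the 80:20 split with a "model" that always predicts the majority class.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Return {class_label: fraction of that class predicted correctly}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {c: correct[c] / total[c] for c in total}


def mean_class_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class accuracies."""
    per_class = per_class_accuracy(y_true, y_pred)
    return sum(per_class.values()) / len(per_class)


# Toy 80:20 dataset and a "model" that always predicts the majority class 0.
y_true = [0] * 80 + [1] * 20
y_pred = [0] * 100

plain_acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_acc)                            # 0.8
print(per_class_accuracy(y_true, y_pred))   # {0: 1.0, 1: 0.0}
print(mean_class_accuracy(y_true, y_pred))  # 0.5
```

Plain accuracy looks fine at 0.8, while the per-class view shows the minority class is never predicted correctly and the mean class accuracy drops to 0.5.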

Regarding training on an unbalanced dataset, loss functions allow you to pass in a weight argument (per-class weights, for multi-class losses such as CrossEntropyLoss) or a pos_weight argument (a weight on the positive class, for binary losses such as BCEWithLogitsLoss). These in turn increase the loss contribution of the minority classes and/or decrease the loss contribution of the majority classes.
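
To make the effect of such a weight concrete, here is a small plain-Python sketch of binary cross-entropy on raw logits with a positive-class weight. It follows the same formula that BCEWithLogitsLoss documents for pos_weight, but the function weighted_bce_with_logits itself is a hypothetical helper written for this post, and setting the weight to n_negative / n_positive is a common starting point rather than a rule.

```python
import math


def weighted_bce_with_logits(logits, targets, pos_weight=1.0):
    """Mean binary cross-entropy on raw logits; positive samples are scaled by pos_weight."""
    total = 0.0
    for x, y in zip(logits, targets):
        # Numerically stable log(sigmoid(x)).
        if x >= 0:
            log_sig = -math.log1p(math.exp(-x))
        else:
            log_sig = x - math.log1p(math.exp(x))
        # log(1 - sigmoid(x)) = log(sigmoid(x)) - x
        log_one_minus_sig = log_sig - x
        total += -(pos_weight * y * log_sig + (1 - y) * log_one_minus_sig)
    return total / len(logits)


# Toy 80:20 split with an uninformed model (all logits 0, i.e. p = 0.5).
targets = [0] * 80 + [1] * 20
logits = [0.0] * 100

n_pos = sum(targets)
n_neg = len(targets) - n_pos
pos_weight = n_neg / n_pos  # 4.0 for an 80:20 split (assumed heuristic)

plain = weighted_bce_with_logits(logits, targets)
weighted = weighted_bce_with_logits(logits, targets, pos_weight=pos_weight)
print(plain, weighted)
```

With pos_weight = 4.0, the 20 positive samples contribute as much to the total loss as the 80 negative ones, so the model can no longer lower the loss much by ignoring the minority class.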


Thanks Johnson. I’ve tried weighted loss functions quite a few times in the past but never had any success with them. The RNN, however, seems to be doing a lot better with a weighted loss function than with a plain one.