I’ve created a two-stage LSTM-based classification model, trained with mini-batches. The two stages are in separate code files.
I have checked the gradients; they are updating. The model seems to work correctly on one dataset (I merged two of the three labels into label 0 and kept the third as label 1, making it a binary classification). Here both training and validation losses decrease steadily, and after a point the validation loss stabilises/increases once the model starts to overfit. I believe this pattern is correct and that this classifier is working properly; please let me know if it isn’t.
In a second code file:
I take a smaller subset of the same test dataset: the samples the first classifier predicted as label 0. Since label 0 was originally two labels merged together, I now try to classify this subset into those two original labels.
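For clarity, here is a minimal sketch of that stage-2 filtering step, with made-up tensors (the feature shapes, logits, and label values are hypothetical, not my actual data): fine-grained labels 0 and 1 were merged into stage-1 label 0, and fine-grained label 2 was stage-1 label 1.

```python
import torch

# Hypothetical example: 3 test sentences with 5 features each,
# stage-1 binary logits, and the original fine-grained labels.
features = torch.randn(3, 5)
stage1_logits = torch.tensor([[2.0, -1.0],   # predicted 0 -> routed to stage 2
                              [0.1, 0.9],    # predicted 1 -> dropped
                              [1.5, 0.2]])   # predicted 0 -> routed to stage 2
fine_labels = torch.tensor([0, 2, 1])

stage1_pred = stage1_logits.argmax(dim=1)
mask = stage1_pred == 0                  # sentences routed to stage 2

stage2_x = features[mask]                # inputs for the second classifier
stage2_y = fine_labels[mask]             # the two original labels to separate
print(stage2_y.tolist())                 # -> [0, 1]
```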
The problem I am facing is that after a few epochs everything gets classified into the same class. I thought it was due to imbalanced data, so I used class weights in the loss function to counter that. I have also used dropout and weight decay to prevent overfitting. Nothing seems to help much: adding the regularisation and decreasing the learning rate slows down the collapse, but everything still ends up in the same class.
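In case it matters, this is roughly how I am weighting the loss (a minimal sketch, assuming PyTorch's `nn.CrossEntropyLoss`; the 80/20 label counts below are made up, not my real distribution):

```python
from collections import Counter

import torch
import torch.nn as nn

labels = [0] * 2400 + [1] * 600          # hypothetical imbalanced labels
counts = Counter(labels)
n_classes = len(counts)
total = len(labels)

# weight_c = total / (n_classes * count_c): the rarer class gets a
# proportionally larger weight in the loss.
weights = torch.tensor(
    [total / (n_classes * counts[c]) for c in range(n_classes)],
    dtype=torch.float32,
)

criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, n_classes)       # fake batch of model outputs
targets = torch.randint(0, n_classes, (8,))
loss = criterion(logits, targets)
print(weights.tolist())                  # -> [0.625, 2.5]
```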
Any suggestions on what might be going wrong here? I can share the code if needed.
The dataset is also pretty small, and I hope that is not the issue: about 3k training sentences and 1.3k test sentences.