I am training a model to predict nationality from a name, so I have 95 countries as labels.
But out of those 95, the model only ever outputs 5 or 6 countries.
I have read some posts about this. @ptrblck says it can be caused by a high learning rate (this could be the reason, as my learning rate is high), and some say it is due to batch normalization (although I haven't used it in my model).
However, I have used dropout (I have 3 layers, for unigrams, bigrams and trigrams) and gradient clipping. Could these two factors also be contributing to the problem?
PS: I am also decaying the learning rate by a factor of 0.99 per epoch, but the learning rate still stays high for about 150 epochs.
Any help is appreciated.
Thanks in advance.
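To make the decay schedule from the PS concrete, here is a minimal sketch of the arithmetic. The starting value `initial_lr` is hypothetical (the post does not state it); only the 0.99 factor comes from the post.

```python
# Exponential per-epoch decay: lr_t = lr_0 * decay_rate ** t
initial_lr = 0.1   # hypothetical starting learning rate, not given in the post
decay_rate = 0.99  # per-epoch decay factor from the post


def lr_at_epoch(epoch, lr0=initial_lr, gamma=decay_rate):
    """Return the learning rate after `epoch` decay steps."""
    return lr0 * gamma ** epoch


for epoch in (0, 50, 100, 150, 300):
    lr = lr_at_epoch(epoch)
    print(f"epoch {epoch:3d}: lr = {lr:.4f} ({lr / initial_lr:.1%} of initial)")
```

With a factor of 0.99, the learning rate after 150 epochs is still roughly 22% of its starting value, and it takes about 69 epochs just to halve. In PyTorch, stepping `torch.optim.lr_scheduler.ExponentialLR` with `gamma=0.99` once per epoch should give the same schedule.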
Have you tried to play around with the learning rate a bit?
The overfitting could also be due to an imbalanced dataset, but I’m not sure if that’s the case for your use case.
How many samples of each class do you have?
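A quick way to answer the question about samples per class is to count the labels. This is only a sketch: `train_labels` is a hypothetical stand-in for the real list of country labels in the training set.

```python
from collections import Counter

# Hypothetical labels; replace with the actual training labels.
train_labels = ["US", "US", "US", "US", "IN", "IN", "FR", "JP"]

class_counts = Counter(train_labels)
total = sum(class_counts.values())

# Print classes from most to least frequent
for label, count in class_counts.most_common():
    print(f"{label}: {count} samples ({count / total:.1%})")

# Ratio between the most and the least frequent class
imbalance_ratio = max(class_counts.values()) / min(class_counts.values())
print(f"imbalance ratio: {imbalance_ratio:.1f}")
```

Running the same `Counter` over the model's predicted labels would show which 5 or 6 countries it keeps outputting and whether they are the most frequent ones in the training data.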
Thanks for the response. I haven't had time to check the dataset for imbalance yet, but I was considering that as a possible reason as well.
I have also played with the learning rate a little: decreasing it does increase the validation accuracy. However, I am trying to implement a research paper that starts with a high LR, so I did the same just to see the difference. With the high LR the accuracy dropped; as the LR slowly decays the accuracy is rising again, but it has not yet reached the accuracy I achieved earlier with the smaller learning rate. After checking the validation and test results, I think it might be data imbalance, but I am still wondering whether gradient clipping and dropout are also contributing to this problem.
P.S. I employed gradient clipping only because I am using LSTMs. And yes, I am switching between model.train() and model.eval().
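For reference, this is roughly what norm-based gradient clipping does, written in plain Python so it runs without any framework. It is a simplified sketch, not the implementation of `torch.nn.utils.clip_grad_norm_`; the function name and the gradient values are made up for illustration.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale gradient values so that their L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm


grads = [3.0, 4.0, 12.0]  # hypothetical gradient values with L2 norm 13
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for g in clipped))

print(f"norm before: {norm_before:.2f}, norm after: {norm_after:.2f}")
print("clipped gradients:", [round(g, 4) for g in clipped])
```

Every component is multiplied by the same factor, so clipping of this kind shortens the update step but keeps its direction. That suggests it is a less likely cause of the model collapsing onto a few classes than the learning rate or the data, though this sketch alone does not settle the question.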
You are right, my data is heavily imbalanced (far too much). Still, I am able to train it well now.
Many thanks for your suggestions, but the original question is still unanswered!
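For the imbalance confirmed above, one common mitigation is to weight the loss by inverse class frequency. The thread does not say this is what was used, so treat the following as an illustrative sketch with hypothetical labels.

```python
from collections import Counter

# Hypothetical, deliberately imbalanced labels.
train_labels = ["US"] * 6 + ["IN"] * 3 + ["FR"] * 1

counts = Counter(train_labels)
num_classes = len(counts)
total = len(train_labels)

# "Balanced" heuristic: weight_c = total / (num_classes * count_c),
# so rare classes get proportionally larger weights.
class_weights = {
    label: total / (num_classes * count) for label, count in counts.items()
}

for label, weight in sorted(class_weights.items(), key=lambda kv: kv[1]):
    print(f"{label}: count = {counts[label]}, weight = {weight:.3f}")
```

In PyTorch, weights like these could be converted to a tensor ordered by class index and passed as the `weight` argument of `torch.nn.CrossEntropyLoss`.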