I am running experiments with a pre-trained MobileNetV2 model for 30 epochs, with the learning rate scaled by 0.1 every 10 epochs, using these hyperparameters:
learning rate = 0.001
weight decay = 4e-5
momentum = 0.9
batch size = 64
optimizer = SGD
dropout = 0.2
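For reference, the hyperparameters and schedule above correspond to a setup roughly like the following sketch (the stand-in model and the class count of 10 are placeholders; the real experiment uses the pre-trained MobileNetV2):

```python
import torch
import torch.nn as nn

# Stand-in classifier head; the real model is a pre-trained MobileNetV2
# (1280 backbone features), and 10 classes is a placeholder value.
model = nn.Sequential(nn.Dropout(p=0.2), nn.Linear(1280, 10))

optimizer = torch.optim.SGD(
    model.parameters(), lr=0.001, momentum=0.9, weight_decay=4e-5
)
# Multiply the learning rate by 0.1 every 10 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

for epoch in range(30):
    # ... one epoch of training with batch size 64 would go here ...
    scheduler.step()
```

After 30 epochs the learning rate has been cut three times, from 0.001 down to 1e-6.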
I am using class_weights since my training dataset is imbalanced (my test dataset is perfectly balanced). The formula for each weight is minority class size / class size. I am getting good results: the per-class accuracies from my confusion matrix are in the 80s and 90s. The issue I am facing is that after a while the model starts to fluctuate; for example, in my latest experiment, after 11 epochs the accuracy kept bouncing between 87% and 91%.
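The weight formula can be sketched like this (the labels and class names are made-up placeholders, not my actual dataset):

```python
from collections import Counter

# Hypothetical imbalanced label set standing in for the real training data
train_labels = ["cat"] * 500 + ["dog"] * 200 + ["bird"] * 50

counts = Counter(train_labels)
minority = min(counts.values())  # size of the smallest class

# weight = minority class size / class size
class_weights = {c: minority / n for c, n in counts.items()}
print(class_weights)  # {'cat': 0.1, 'dog': 0.25, 'bird': 1.0}
```

These weights then go into the loss, e.g. `nn.CrossEntropyLoss(weight=torch.tensor([...]))` with the values ordered by class index.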
What I have tried so far:
doubling the weight of the minority class so that the model learns it as well as the others.
changing the learning rate: either 0.01 or 0.0001
changing how much the learning rate is adjusted
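Concretely, the attempts above look something like this (the class names and the alternative decay factor are hypothetical examples, not my exact values):

```python
# Weights from the minority/class-size formula, with placeholder classes;
# "bird" plays the role of the minority class here.
class_weights = {"cat": 0.1, "dog": 0.25, "bird": 1.0}

# 1. Doubling the minority class weight
class_weights["bird"] *= 2

# 2. Alternative base learning rates tried in place of 0.001
lr_variants = [0.01, 0.0001]

# 3. A gentler decay, e.g. multiplying by 0.5 instead of 0.1 every 10 epochs
#    (0.5 is a hypothetical example value)
gamma = 0.5
```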