DNN Cannot Stop Overfitting

I am training a DNN (CNN + RNN) for a voice conversion task. Although my training loss gets very low and training performance is good, I believe I am experiencing massive overfitting. To counter this, I have already added quite a bit of batch norm and dropout inside the model, as well as weight decay — however, the model still overfits heavily. I present some of my loss curves below:
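For context, the regularization described above corresponds roughly to the sketch below. This is a minimal, illustrative PyTorch setup, not the actual model: the layer shapes, dropout probability, and weight decay value are placeholders.

```python
import torch
import torch.nn as nn

class ConvRNN(nn.Module):
    """Toy CNN + RNN stack with batch norm and dropout (illustrative only)."""
    def __init__(self, n_mels=80, hidden=256, p_drop=0.3):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_mels, hidden, kernel_size=5, padding=2),
            nn.BatchNorm1d(hidden),       # batch norm after the conv layer
            nn.ReLU(),
            nn.Dropout(p_drop),           # dropout between blocks
        )
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_mels)

    def forward(self, x):                  # x: (batch, n_mels, time)
        h = self.conv(x).transpose(1, 2)   # -> (batch, time, hidden)
        h, _ = self.rnn(h)
        return self.out(h)

model = ConvRNN()
# weight decay (L2 penalty) is applied through the optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
```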

With a weight decay constant of 1e-7:
[train/dev loss curves]

With a weight decay constant of 1e-2:
[train/dev loss curves]

I also noticed that if the weight decay constant is > 1e-4, the model seems to underfit.

I want to know what else I can do to improve this model’s generalization. Is it just a matter of more data, or do I need to modify my DNN architecture in some way? I have been struggling with this overfitting problem for several days now, and any insight would be a great help.

  1. Make sure your data is split appropriately between the train and the validation set. Use shuffling with stratified sampling if possible (see the stratified-split sketch after this list).
  2. Experiment with the learning rate (lr). The spikes in your dev_mse curve could be due to an lr that is too high; a scheduler can help (see the scheduler sketch after this list).
  3. Which optimizer are you using? Adam or RAdam optimizer should work well in most cases.
  4. Since you’ve already experimented with dropout, you can also try L1/L2 regularization (see the L1/L2 sketch after this list).
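For point 1, a shuffled, stratified split is straightforward with scikit-learn if you have a label to stratify on (e.g. speaker identity). The arrays below are hypothetical stand-ins for your data:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# hypothetical data: 1000 utterances, each tagged with one of 10 speakers
features = np.random.randn(1000, 80)
speaker_ids = np.random.randint(0, 10, size=1000)

# shuffled, stratified split: speaker proportions stay the same in both sets
X_train, X_val, y_train, y_val = train_test_split(
    features, speaker_ids,
    test_size=0.1,
    shuffle=True,
    stratify=speaker_ids,
    random_state=0,
)
```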
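For point 2, one common way to tame those dev-loss spikes is to reduce the learning rate when the validation metric stops improving. A minimal sketch using PyTorch's built-in `ReduceLROnPlateau`; the model and the dev loss are placeholders:

```python
import torch
import torch.nn as nn

model = nn.Linear(80, 80)                 # stand-in for the real network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# halve the learning rate if dev MSE has not improved for 3 epochs
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=3
)

for epoch in range(50):
    # ... forward/backward/optimizer.step() over the training set goes here ...
    dev_mse = 1.0 / (epoch + 1)           # placeholder for the real dev loss
    scheduler.step(dev_mse)               # scheduler watches the validation metric
```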
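For point 4, L2 regularization is usually handled by the optimizer's `weight_decay`, while an L1 penalty can be added to the loss by hand. A minimal sketch; the model, data, and `l1_lambda` value are purely illustrative:

```python
import torch
import torch.nn as nn

model = nn.Linear(80, 80)                          # stand-in for the real network
criterion = nn.MSELoss()
# L2 regularization via weight decay in the optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

x, target = torch.randn(16, 80), torch.randn(16, 80)
pred = model(x)

# L1 regularization added explicitly to the loss
l1_lambda = 1e-5
l1_penalty = sum(p.abs().sum() for p in model.parameters())
loss = criterion(pred, target) + l1_lambda * l1_penalty

optimizer.zero_grad()
loss.backward()
optimizer.step()
```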