MLP regression model always outputs the same value (approaching zero)

Hi, I tried an MLP with only one hidden layer (20k neurons). When I use the Adadelta optimizer, the training loss converges to almost 0, so the model overfits. For the next step, should I keep using the Adadelta and Rprop optimizers and increase the depth of the model to see what happens, or go back to SGD and Adam?
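For reference, a minimal sketch of the setup described above; the input/output sizes and the MSE loss are assumptions, since the post only specifies the 20k-neuron hidden layer:

```python
import torch
import torch.nn as nn

# Hypothetical sizes; only the 20k-unit hidden layer comes from the post.
in_features, hidden, out_features = 100, 20_000, 1

model = nn.Sequential(
    nn.Linear(in_features, hidden),
    nn.ReLU(),
    nn.Linear(hidden, out_features),
)
criterion = nn.MSELoss()  # assumed regression loss
optimizer = torch.optim.Adadelta(model.parameters())

def train_step(x, y):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```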

You can continue with Adadelta. Or check whether switching to a “manual lr” optimizer (e.g., SGD or Adam) harms training, and tune the learning rate to fix that (perhaps also adding a learning-rate scheduler).

Adapting this, a two-layer model with somewhat reduced capacity should continue to train and be able to generalize to some degree.
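A rough sketch of that suggestion, assuming a narrower two-layer MLP trained with a manual-lr optimizer plus a scheduler; the widths, learning rate, and schedule below are starting points to tune, not prescriptions:

```python
import torch
import torch.nn as nn

in_features, out_features = 100, 1  # hypothetical sizes, as above

# Two hidden layers, much narrower than the original 20k-unit layer.
model = nn.Sequential(
    nn.Linear(in_features, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, out_features),
)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr is the knob to tune
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, factor=0.5, patience=5
)

def train_epoch(loader):
    total = 0.0
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        total += loss.item()
    avg = total / len(loader)
    scheduler.step(avg)  # ideally step on a validation metric instead
    return avg
```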

Thank you very much! One thing I noticed during yesterday's experiments with Adadelta: when I increase the penalty (for example, by increasing weight_decay, the L2 penalty), the model tends to produce the same output regardless of the input; when I decrease the penalty, the model tends to overfit. Anyway, thank you for your patience and help!
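Concretely, what I was adjusting is the optimizer's weight_decay argument, roughly like this (the sweep values are illustrative, not the ones from my runs):

```python
import torch

# Assumes the `model` from the sketches above. Larger weight_decay pushes the
# weights toward zero (constant outputs); smaller values let the model overfit.
for wd in (0.0, 1e-5, 1e-4, 1e-3):
    optimizer = torch.optim.Adadelta(model.parameters(), weight_decay=wd)
    # ... retrain from scratch and compare validation loss for each setting
```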
