What are the optimal options for training (epochs, batch size, etc.)?

Dear friends,

I have a question about how to set the optimal training options (number of layers, batch size, number of epochs, etc.) to get better results and avoid overtraining.


There are no optimal, one-size-fits-all settings. When it comes to training a network, there’s a lot of trial and error involved. There are whole research papers just on how to set and update the learning rate, for example. Here are just some very basic guidelines of mine:

  • Can you overtrain the network on a small dataset? This is just to check that the network, loss calculation, and backprop are (most likely) working correctly.
  • After each epoch, calculate both the training and test loss and track their trends over time. The training loss should essentially go down and may converge at some point. The test loss should also go down, at least at first. If/once the test loss increases again, your network is overtraining, and you might want to stop the training.
  • A good batch size depends on the learning rate and vice versa. If the batch size is large, the loss is averaged over more samples and is therefore typically smoother (less noisy) than for small batch sizes.
  • In the end, you have to experiment: if the training/test loss goes down but only slowly, increase the learning rate. If the training/test loss jumps up and down, decrease the learning rate.
  • I’m not sure what you mean by “number of layers” here, since that refers to the network architecture rather than the training itself. In general, however, the more complex a network (e.g., in terms of the number of layers), the more training is needed, both in terms of the dataset and the computation (i.e., epochs).
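To illustrate the first point (overtraining on a small dataset as a sanity check), here is a minimal, framework-agnostic sketch in plain Python: a one-parameter linear model fit by gradient descent on three samples. The model, data, and learning rate are all made up for illustration; the idea is only that if your forward pass, loss, and gradient step are wired correctly, the training loss on a tiny dataset should drop to (nearly) zero.

```python
# Sanity check: can we drive the training loss to ~0 on a tiny dataset?
# Model: y_hat = w * x, loss: mean squared error.

def overfit_tiny_dataset(xs, ys, lr=0.05, epochs=500):
    w = 0.0  # single trainable weight
    for _ in range(epochs):
        # gradient of the MSE loss with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad  # plain gradient descent step
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return w, loss

# Data generated from y = 2x, so w should converge to ~2 and loss to ~0.
w, loss = overfit_tiny_dataset([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

If the loss does not go to (almost) zero here, something in the pipeline is broken, and tuning hyperparameters on the full dataset would be wasted effort.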
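The second point (stop once the test loss rises again) can be sketched as a small early-stopping helper. The `patience` parameter and the example loss values are illustrative assumptions, not part of the original post:

```python
def early_stop_epoch(test_losses, patience=2):
    """Return the epoch index with the lowest test loss, scanning until the
    loss has failed to improve for `patience` consecutive epochs."""
    best_epoch, best_loss, bad = 0, float("inf"), 0
    for epoch, loss in enumerate(test_losses):
        if loss < best_loss:
            best_epoch, best_loss, bad = epoch, loss, 0
        else:
            bad += 1
            if bad >= patience:
                break  # test loss keeps rising: overtraining
    return best_epoch

# Test loss falls, then rises again -> keep the weights from epoch 3.
early_stop_epoch([1.0, 0.7, 0.5, 0.4, 0.45, 0.6])  # returns 3
```

In practice you would check this condition inside the training loop after each epoch and restore the model checkpoint saved at the best epoch.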

I hope that helps a bit.


Dear Chris,

Thank you so much, that’s all I was looking for. This is my first deep learning project, and there are many things I have to learn.