Hello!
I need a neural network to converge quickly within a limited number of epochs. Task: image classification with a CNN.
What I discovered:
The network should be shallow.
The learning rate needs to be tuned carefully for fast convergence.
Could you recommend the fastest-converging optimizer, along with a learning rate and an activation function for the last layer, so I can achieve this?
Fast.ai ran some interesting experiments with super-convergence and published a blog post about it here. It may be applicable to your use case as well.
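The core of super-convergence is the one-cycle learning-rate policy: ramp the learning rate up to a large peak, then anneal it back down over the remaining steps. A minimal sketch in plain Python follows; note that `pct_warmup`, `div`, and `final_div` are illustrative defaults I chose, not fastai's exact values, and real frameworks (e.g. PyTorch's `OneCycleLR`) also cycle momentum inversely.

```python
import math

def one_cycle_lr(step, total_steps, max_lr, pct_warmup=0.3,
                 div=25.0, final_div=1e4):
    """One-cycle policy sketch: linear warmup to max_lr,
    then cosine annealing down to a tiny final LR."""
    warmup_steps = int(total_steps * pct_warmup)
    start_lr = max_lr / div          # LR at step 0
    final_lr = max_lr / final_div    # LR at the last step
    if step < warmup_steps:
        # Linear ramp from start_lr up to max_lr.
        t = step / max(1, warmup_steps)
        return start_lr + t * (max_lr - start_lr)
    # Cosine anneal from max_lr down to final_lr.
    t = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return final_lr + 0.5 * (max_lr - final_lr) * (1 + math.cos(math.pi * t))

# Example: peak LR of 0.1 over 100 steps.
schedule = [one_cycle_lr(s, 100, 0.1) for s in range(101)]
```

You would call this once per optimizer step and assign the result to the optimizer's learning rate; the large mid-training peak acts as a regularizer and is what allows training with far fewer epochs.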
If accuracy is important to you, SGD is certainly a safe bet. However, it is also incredibly slow.
But nothing in “the book” says you can’t start with a faster optimizer, such as Adagrad or Adadelta, and then switch to SGD for the final few epochs to squeeze that last bit of accuracy out of the model. I find this to be the best approach.
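The optimizer-switch idea can be sketched on a toy 1-D problem; the quadratic objective, step counts, and learning rates below are made up purely for illustration, and in a real framework you would instead construct a new optimizer over the same model parameters partway through training.

```python
def adagrad_step(w, grad, cache, lr=1.0, eps=1e-8):
    """One Adagrad update: per-parameter LR shrinks as
    squared gradients accumulate in `cache`."""
    cache = cache + grad * grad
    w = w - lr * grad / (cache ** 0.5 + eps)
    return w, cache

def sgd_step(w, grad, lr=0.1):
    """One plain SGD update."""
    return w - lr * grad

# Toy objective f(w) = (w - 3)^2, so grad f(w) = 2 * (w - 3).
grad_f = lambda w: 2.0 * (w - 3.0)

w, cache = 0.0, 0.0
# Phase 1: fast initial progress with Adagrad.
for _ in range(50):
    w, cache = adagrad_step(w, grad_f(w), cache)
# Phase 2: switch to SGD to refine toward the minimum.
for _ in range(50):
    w = sgd_step(w, grad_f(w))
```

After both phases `w` sits very close to the minimizer at 3; the same two-phase structure carries over directly to a training loop, where the switch typically happens in the last few epochs.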