Comparison of torch.optim


torch.optim has a wide and rich selection of both common and more exotic optimizers. I am wondering which optimizer to pick for an object segmentation project.

I was reading through this paper: S. Ruder - An overview of gradient descent optimization algorithms.

I was wondering whether something similar has been published with updated guidelines covering more modern optimizers like AdamW, SparseAdam, etc.
In other words, which optimizer should I choose?
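For context, here is a minimal sketch of what I mean by "picking" an optimizer (the tiny `Linear` model is just a stand-in for a real segmentation network): in `torch.optim`, swapping one optimizer for another only changes the constructor call, while the training loop stays the same, so the choice really comes down to the algorithm itself.

```python
import torch

# Stand-in for a segmentation network (hypothetical, for illustration only)
model = torch.nn.Linear(4, 2)

# AdamW decouples weight decay from the gradient update (unlike Adam's
# L2 penalty folded into the gradient), which is why it is often cited
# as a sensible modern default.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)

# The surrounding training loop is identical regardless of the optimizer:
x = torch.randn(8, 4)
loss = model(x).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

Trying, e.g., `torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)` instead is a one-line change, but I would rather not grid-search over every optimizer blindly.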

Thank you for your time.