Are there any other recommended optimizers for word2vec/GloVe besides Adagrad and SparseAdam?

Adagrad and SparseAdam work well for sparse training because they keep a separate accumulator (e.g. a per-parameter sum of squared gradients) for each parameter, so infrequently updated embedding rows still get sensibly scaled steps.
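
For context, here is a minimal sketch of how I'm using SparseAdam with a sparse embedding layer (the vocabulary size, dimension, and placeholder loss are just illustrative):

```python
import torch
import torch.nn as nn

# Toy sizes, just for illustration.
vocab_size, emb_dim = 10_000, 128

# sparse=True makes the embedding produce sparse gradients.
embedding = nn.Embedding(vocab_size, emb_dim, sparse=True)

# SparseAdam keeps moment estimates per parameter and only updates
# the rows whose indices appear in the current batch.
optimizer = torch.optim.SparseAdam(embedding.parameters(), lr=1e-3)

# One illustrative training step with random indices and a placeholder loss.
indices = torch.randint(0, vocab_size, (32,))
vectors = embedding(indices)
loss = vectors.pow(2).mean()  # stand-in for a real word2vec/GloVe loss
loss.backward()               # yields sparse gradients for the embedding
optimizer.step()
optimizer.zero_grad()
```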

Are there any other recommended optimizers for embedding training?