Use sampled softmax for seq2seq

I’m training a seq2seq model for machine translation, but the target vocabulary (number of output classes) is very large, so computing the full softmax makes each iteration take too long. I came across sampled softmax, which promises to speed up that computation. Has anyone used it with a seq2seq model before?
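
For context, here is roughly how I’m thinking of wiring it up with `tf.nn.sampled_softmax_loss`. This is just a sketch; the variable names, sizes, and the `sampled_loss` helper are placeholders I made up, not code from a tutorial:

```python
import tensorflow as tf

# Sketch only: vocab_size, hidden_size, and num_sampled are made-up values.
vocab_size = 50000      # large target vocabulary
hidden_size = 512       # decoder hidden state dimension
num_sampled = 512       # number of classes sampled per training step

# Output projection kept as separate weight/bias variables so the
# sampled loss can look up only the sampled rows instead of the full matrix.
proj_w = tf.Variable(tf.random.truncated_normal([vocab_size, hidden_size], stddev=0.1))
proj_b = tf.Variable(tf.zeros([vocab_size]))

def sampled_loss(decoder_outputs, target_ids):
    """decoder_outputs: [batch, hidden_size] decoder states,
    target_ids: [batch] integer token ids."""
    labels = tf.reshape(target_ids, [-1, 1])
    return tf.nn.sampled_softmax_loss(
        weights=proj_w,
        biases=proj_b,
        labels=labels,
        inputs=decoder_outputs,
        num_sampled=num_sampled,
        num_classes=vocab_size)
```

My understanding is that this loss is only used during training, and at inference time I would still project with `proj_w`/`proj_b` and take a full softmax over the vocabulary. Is that right, and does it actually give a noticeable speedup in practice?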