Is Noam scheduling widely used for training transformer-based models?

As the title says, I wonder how widely it is used.
It sometimes seems to have a big impact on training transformer-based models.
But I can't find it implemented in torch, huggingface transformers, or tensorflow.
It's only implemented in allennlp and opennmt.
That seems a little weird to me.
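For reference, the schedule itself is simple enough to express with torch's `LambdaLR`, which may be why the big libraries don't ship it under its own name. Here is a minimal sketch I put together; `d_model=512` and `warmup_steps=4000` are just the base values from the "Attention Is All You Need" paper, and the model/optimizer are placeholders:

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

def noam_lambda(d_model: int, warmup_steps: int):
    # Noam schedule: lr = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5)
    def fn(step: int) -> float:
        step = max(step, 1)  # avoid division by zero at step 0
        return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
    return fn

model = torch.nn.Linear(512, 512)  # placeholder model
# Base lr is set to 1.0 so that LambdaLR's multiplicative factor
# becomes the effective learning rate.
optimizer = torch.optim.Adam(model.parameters(), lr=1.0, betas=(0.9, 0.98), eps=1e-9)
scheduler = LambdaLR(optimizer, lr_lambda=noam_lambda(d_model=512, warmup_steps=4000))

for step in range(10):
    optimizer.step()
    scheduler.step()
```

Still, I'd expect something this common to be built in, the way allennlp and opennmt do it.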