Is Noam scheduling widely used for training transformer-based models?

Do you mean this? fairseq.optim.lr_scheduler.inverse_square_root_schedule (fairseq 0.12.2 documentation)
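
If that is the schedule you mean, here is a rough sketch of how I understand the two to relate. The Noam schedule is linear warmup followed by inverse square root decay, scaled by the model dimension. The function names, the `factor` argument, and the default values (`d_model=512`, `warmup_steps=4000`) are my own choices for illustration, and the second function is a plain-Python paraphrase of a warmup plus inverse-sqrt rule, not fairseq's actual code.

```python
import math


def noam_lr(step, d_model=512, warmup_steps=4000, factor=1.0):
    """Noam-style rate: factor * d_model^-0.5 * min(step^-0.5, step * warmup^-1.5)."""
    step = max(step, 1)  # avoid 0 ** -0.5 on the very first call
    return factor * d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


def inverse_sqrt_lr(step, peak_lr, warmup_steps=4000, warmup_init_lr=0.0):
    """Hypothetical helper: linear warmup to peak_lr, then decay proportional to 1/sqrt(step)."""
    if step < warmup_steps:
        # Linear ramp from warmup_init_lr up to peak_lr.
        return warmup_init_lr + (peak_lr - warmup_init_lr) * step / warmup_steps
    # Chosen so the rate equals peak_lr exactly at step == warmup_steps.
    return peak_lr * math.sqrt(warmup_steps) / math.sqrt(step)


if __name__ == "__main__":
    # Setting peak_lr to the Noam peak makes the two curves coincide (under these assumptions).
    peak = noam_lr(4000)
    for s in (1, 1000, 4000, 16000, 100000):
        print(s, noam_lr(s), inverse_sqrt_lr(s, peak))
```

Under these assumptions (warmup starting from 0 and the peak rate matched), the two functions give the same curve, so the practical difference is mostly in how the peak learning rate is specified.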