What's the largest model you train with PyTorch?

Hello folks, could you please tell me the largest model you train with PyTorch?
I want to measure power consumption with and without a few tweaks to the PyTorch source code.
I’m motivated by the ACL 2019 paper "Energy and Policy Considerations for Deep Learning in NLP" (Strubell et al.).
Thank you!