Transformers GPT-2 too large for Google Colab Pro?

Hello,

I would like to benchmark several PyTorch transformer models on the

  • AG_NEWS dataset (torchtext.datasets)
  • using Google Colab Pro with 15 GB of GPU memory available.

While BERT, DistilBERT, Electra, …, XLM-RoBERTa and their tokenizers work without a problem, both GPT-2 and XLM
do not fit in memory.

Since 15 GB seems like a lot, I wonder whether

  • it is a coding problem on my side, or both models are just very large
  • there is a lightweight version of GPT-2 and XLM
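For context, here is my rough back-of-envelope estimate of the memory that full fp32 fine-tuning with Adam would need (weights + gradients + two optimizer states, activations not counted; the helper function is just my own sketch, and the parameter counts are the published GPT-2 sizes):

```python
def finetune_memory_gb(n_params, bytes_per_param=4, optimizer_states=2):
    """Rough fp32 fine-tuning footprint: weights + gradients + Adam's
    two optimizer states (m and v). Activation memory is NOT included,
    so real usage is higher."""
    copies = 1 + 1 + optimizer_states  # weights, grads, optimizer states
    return n_params * bytes_per_param * copies / 1e9

# Published GPT-2 parameter counts
print(finetune_memory_gb(124_000_000))    # gpt2:    ~2 GB before activations
print(finetune_memory_gb(1_500_000_000))  # gpt2-xl: ~24 GB, over 15 GB already
```

So if I am accidentally loading one of the larger checkpoints (or keeping long fp32 activations around), that alone could explain the out-of-memory error, even on 15 GB.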

Kind regards