Using a Pre-trained Transformer Model to Expand Short Sentences into Long Sentences

Hi all,

I need to build a model that can expand multiple short sentences into longer sentences. My idea was to use a pre-trained Transformer model for this, much like paragraph or text summarization, except with the input and output values swapped. I tried this with t5-base on Google Colab, using very minimal data (about 10 rows) just to see whether the approach works at all, regardless of the output quality. But I keep getting errors like the one below:

RuntimeError: CUDA out of memory. Tried to allocate 502.00 MiB (GPU 0;
11.17 GiB total capacity; 10.29 GiB already allocated; 237.81 MiB free; 10.49 GiB reserved in total by PyTorch)
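
For context, here is roughly what my setup looks like. This is a simplified sketch; the column names, the "expand:" prefix, and the hyperparameters are placeholders rather than my exact values:

```python
# Simplified sketch of my fine-tuning setup: short sentence in, long sentence out
# (the reverse of the usual summarization direction). Placeholder data and values.
from datasets import Dataset
from transformers import (
    T5TokenizerFast,
    T5ForConditionalGeneration,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
    DataCollatorForSeq2Seq,
)

tokenizer = T5TokenizerFast.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Tiny toy dataset standing in for my ~10 rows of real data.
data = Dataset.from_dict({
    "short": ["The cat sat."] * 10,
    "long": ["The lazy old cat sat quietly on the warm windowsill."] * 10,
})

def preprocess(batch):
    # The short sentence is the input, the long sentence is the target.
    model_inputs = tokenizer(
        ["expand: " + s for s in batch["short"]],
        max_length=64, truncation=True,
    )
    labels = tokenizer(batch["long"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = data.map(preprocess, batched=True, remove_columns=data.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="t5-expander",
    num_train_epochs=1,
    per_device_train_batch_size=8,  # placeholder value
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```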

I interpret this error as meaning either that I did something wrong or that my idea simply does not work. Can anyone suggest how to approach this?

Please advise

I am not familiar with how PyTorch NLP models work, but in the setups I am familiar with, one thing I would do when such an error is reported is to reduce the batch size.
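
I am not sure which training API you are using, but if it is the Hugging Face Trainer, I believe the batch size is set through the training arguments. Something along these lines, treating it as an untested sketch rather than your exact configuration:

```python
# Lowering the batch size in a Trainer-based setup (argument names assume
# the Hugging Face Trainer API; adapt if you use a plain DataLoader).
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="t5-expander",
    num_train_epochs=1,
    per_device_train_batch_size=1,  # reduced so each step fits in GPU memory
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=8,  # optional: keeps the effective batch size at 8
)
```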

Hi @gphilip,

That works, thanks.
