Slow training of seq2seq models

Hello everyone,

I have been trying to reuse the seq2seq model (NLP From Scratch: Translation with a Sequence to Sequence Network and Attention — PyTorch Tutorials 1.9.0+cu102 documentation) for DNA sequence translation, where the sequences are usually very long (approximately 1,275 tokens per sequence). The problem is that training takes a lot of time, even though the model itself is very simple (GRUs + Luong attention). When I check GPU usage, it stays around 10-15 percent. How can I increase GPU utilization and accelerate the training process?
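For context, here is roughly the shape of the encoder loop I am running, adapted from the tutorial. The sizes and names below are placeholders rather than my exact code, but the structure is the same: the sequence is fed one token at a time with batch size 1, which I suspect is why the GPU sits mostly idle.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder sizes; my real vocabulary and hidden sizes differ.
VOCAB_SIZE = 64        # DNA token vocabulary (placeholder)
HIDDEN_SIZE = 256
MAX_LENGTH = 1275      # typical sequence length in my data

class EncoderRNN(nn.Module):
    """Single-layer GRU encoder with the same structure as the tutorial's encoder."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.embedding = nn.Embedding(input_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size)

    def forward(self, token, hidden):
        # One token at a time: shape (1, 1, hidden_size), batch size 1.
        embedded = self.embedding(token).view(1, 1, -1)
        output, hidden = self.gru(embedded, hidden)
        return output, hidden

encoder = EncoderRNN(VOCAB_SIZE, HIDDEN_SIZE).to(device)

# Dummy input standing in for one encoded DNA sequence.
input_tensor = torch.randint(0, VOCAB_SIZE, (MAX_LENGTH, 1), device=device)
encoder_hidden = torch.zeros(1, 1, HIDDEN_SIZE, device=device)
encoder_outputs = torch.zeros(MAX_LENGTH, HIDDEN_SIZE, device=device)

# Each iteration launches tiny kernels on a batch of 1, so with ~1275 tokens
# per sequence this loop dominates the training time.
for ei in range(input_tensor.size(0)):
    encoder_output, encoder_hidden = encoder(input_tensor[ei], encoder_hidden)
    encoder_outputs[ei] = encoder_output[0, 0]
```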

Thanks in advance for any leads.

PS: GPU usage for the original example (NLP From Scratch: Translation with a Sequence to Sequence Network and Attention — PyTorch Tutorials 1.9.0+cu102 documentation) is also very low, but training is still fast there because the sentence length is only 10.