I have a text classification model that passes word embeddings through a GRU, whose output is fed into a feed-forward network that predicts a single class.
The dataset is huge (1.4 million lines) and I am training on Google Colab. It takes 15 minutes just to get through 500 lines of the dataset.
So to speed up training, I decided to use pre-trained GloVe vectors instead of the randomly initialized embeddings that PyTorch provides. How do I implement that?
And any ideas on what I should do about words that are missing from the GloVe vocabulary? How will they be embedded?
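Based on what I've read so far, I'm thinking of something like the sketch below — `build_embedding_layer` is my own hypothetical helper (not a PyTorch API), and initializing missing words from a small random normal is just my guess at a reasonable default:

```python
import torch
import torch.nn as nn

def build_embedding_layer(vocab, glove_vectors, dim, freeze=True):
    """Build an nn.Embedding initialized from pretrained GloVe vectors.

    vocab: dict mapping word -> row index
    glove_vectors: dict mapping word -> list of floats (assumed already
        parsed from a glove.6B.*d.txt file)
    Words not found in glove_vectors keep a random N(0, 0.6) init,
    which is just a guessed scale, not an official recipe.
    """
    weights = torch.randn(len(vocab), dim) * 0.6
    for word, idx in vocab.items():
        if word in glove_vectors:
            weights[idx] = torch.tensor(glove_vectors[word])
    # freeze=True keeps the embeddings fixed, so no gradients are
    # computed for them during training
    return nn.Embedding.from_pretrained(weights, freeze=freeze)

# Tiny illustrative vocab/vectors (fake data, not real GloVe):
vocab = {"hello": 0, "world": 1, "rareword": 2}
glove = {"hello": [1.0, 1.0, 1.0, 1.0], "world": [0.5, 0.5, 0.5, 0.5]}
embedding = build_embedding_layer(vocab, glove, dim=4)
```

Is this roughly the right approach, or is there a built-in way (e.g. via torchtext) that I should prefer?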
Any other ideas for speeding up training would be appreciated.