Pre-trained word vectors - guidance needed

I am looking to use pre-trained word vectors as the starting point for a text classifier. There seem to be several pre-trained sets available, including word2vec, and my question has two parts:

  1. Are there any word vectors that are better suited to PyTorch than others? I saw FastText mentioned and wondered whether that is a good starting point.
  2. The usual pre-trained vector files are very large and contain millions of words. Is there a way to manage this? In reality I only need a fairly small fraction of them, and I don’t want all of my memory consumed by loading and storing masses of data I don’t need.

Apologies if these are novice questions. I have looked at earlier posts, but they don’t seem to quite answer the questions I have raised.

Many thanks in advance


torchtext will load only the subset of vectors you’re using and has GloVe built in, so it is likely the easiest way to get word vectors. FastText is newer, though, and probably a little better.
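To make the "load only the subset" idea concrete, here is a minimal sketch of it in plain Python. It is not how torchtext does it internally. It only assumes the common text layout used by GloVe files (one word per line, followed by space-separated floats), and the function name `load_subset_vectors` is made up for this illustration.

```python
import io


def load_subset_vectors(lines, wanted_words):
    """Stream a GloVe-style text file and keep only the vectors for wanted_words.

    `lines` is any iterable of text lines (an open file works), so the full
    file is never held in memory at once.
    """
    wanted = set(wanted_words)
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 3:
            # Assumption: lines with fewer than 3 fields are blank lines or a
            # "count dim" header such as the one in fastText .vec files.
            continue
        word = parts[0]
        if word in wanted:
            vectors[word] = [float(x) for x in parts[1:]]
            if len(vectors) == len(wanted):
                break  # every requested word was found, stop reading early
    return vectors


# Tiny in-memory stand-in for a real vector file (values are invented).
sample = io.StringIO(
    "the 0.1 0.2 0.3\n"
    "cat 0.4 0.5 0.6\n"
    "dog 0.7 0.8 0.9\n"
)
vectors = load_subset_vectors(sample, ["cat", "dog", "unicorn"])
print(sorted(vectors))
```

With a real file you would pass something like `open(path, encoding="utf-8")` instead of the `StringIO` object, where `path` points at whichever vector file you downloaded. Words that are missing from the file (like "unicorn" here) are simply absent from the result, so you need to decide how to initialise them. The kept vectors can then be stacked into a tensor and handed to `torch.nn.Embedding.from_pretrained`.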

Many thanks James, I will give it a try. It will be good to see how GloVe works; it might not be as up to date as FastText, but it’s going to be better than my previous model.