I can use a word vector model in txt format with torchtext as follows:
import os
from torchtext.vocab import Vectors

# Make sure the cache directory exists
if not os.path.exists('.vector_cache'):
    os.mkdir('.vector_cache')
vectors = Vectors(name='myvector/glove/glove.6B.200d.txt')
TEXT.build_vocab(train, vectors=vectors)
However, when I switch to a binary-format file such as GoogleNews-vectors-negative300.bin, I get an error: could not convert string to float.
The code is almost the same as above:
if not os.path.exists('.vector_cache'):
    os.mkdir('.vector_cache')
vectors = Vectors(name='GoogleNews-vectors-negative300.bin')
TEXT.build_vocab(train, vectors=vectors)
So, how can I use a word vector model in binary format to build a vocab?
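My guess is that Vectors parses the file as text, so the raw float bytes in the .bin file are what trigger the float conversion error. One workaround I am considering is converting the binary file to text first. Below is a minimal sketch using only the standard library, assuming the file follows the usual word2vec binary layout (a "vocab_size dim" header line, then each word terminated by a space and followed by dim little-endian float32 values). The function name word2vec_bin_to_txt, the limit parameter, and the output file name are my own, not part of torchtext.

```python
import struct


def word2vec_bin_to_txt(bin_path, txt_path, limit=None):
    """Convert a word2vec-style binary file to text: one 'word v1 v2 ...' per line.

    Assumes the common word2vec binary layout; `limit` (hypothetical helper
    argument) keeps only the first `limit` entries.
    """
    with open(bin_path, "rb") as src, open(txt_path, "w", encoding="utf-8") as dst:
        # Header line: b"<vocab_size> <dim>\n"
        vocab_size, dim = map(int, src.readline().split())
        count = vocab_size if limit is None else min(limit, vocab_size)
        for _ in range(count):
            # Read the word byte by byte until the separating space.
            chars = bytearray()
            while True:
                ch = src.read(1)
                if not ch:
                    raise EOFError("file ended before all entries were read")
                if ch == b" ":
                    break
                if ch != b"\n":  # skip the newline some writers put after each vector
                    chars += ch
            word = chars.decode("utf-8", errors="ignore")
            # The vector is stored as `dim` consecutive 4-byte floats.
            values = struct.unpack("<%df" % dim, src.read(4 * dim))
            dst.write(word + " " + " ".join("%.6f" % v for v in values) + "\n")
    return count
```

If this works, the converted file could presumably be loaded like the GloVe one, e.g. Vectors(name='GoogleNews-vectors-negative300.txt'). As far as I know, gensim can do the same conversion with KeyedVectors.load_word2vec_format(path, binary=True) followed by save_word2vec_format(out_path, binary=False), but I have not confirmed that this is the recommended route for torchtext.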
In addition, should we use the vocabulary of the pre-trained model directly, or build a vocabulary from the training set, or build a vocabulary from the training set + test set? I am very confused about this.
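To make the options concrete, here is a rough sketch of what I mean by "build a vocabulary from the training set" and then checking how much of it the pre-trained model covers. It is plain Python rather than torchtext, and the names build_vocab_from_tokens and pretrained_coverage are hypothetical helpers of mine, not library functions.

```python
from collections import Counter

SPECIALS = ("<unk>", "<pad>")


def build_vocab_from_tokens(tokenized_sentences, min_freq=1):
    """Build index<->word mappings from training tokens only."""
    counts = Counter(tok for sent in tokenized_sentences for tok in sent)
    kept = [w for w, c in counts.items() if c >= min_freq]
    # Most frequent words first; ties broken alphabetically for determinism.
    kept.sort(key=lambda w: (-counts[w], w))
    itos = list(SPECIALS) + kept
    stoi = {w: i for i, w in enumerate(itos)}
    return itos, stoi


def pretrained_coverage(itos, pretrained_words):
    """Fraction of non-special vocab words that have a pre-trained vector."""
    words = [w for w in itos if w not in SPECIALS]
    if not words:
        return 0.0
    return sum(w in pretrained_words for w in words) / len(words)
```

With this approach, words missing from the pre-trained model would still need some initial vector (for example zeros or random values), and words that appear only in the test set would map to <unk>.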
Any help would be greatly appreciated!