Randomly initialized embeddings for torchtext

I’d like to randomly initialize word embeddings - can I do the following:


    import numpy as np
    import torch
    import torch.nn as nn

    TEXT.build_vocab(train_data)
    vocab_size = len(TEXT.vocab)
    embedding_length = 300  # example embedding dimension
    embedding_vectors = torch.FloatTensor(np.random.rand(vocab_size, embedding_length))
    word_embeddings = nn.Embedding(vocab_size, embedding_length)
    word_embeddings.weight = nn.Parameter(embedding_vectors, requires_grad=True)

to do so?
I have heard tales of a parameter for build_vocab that allows for this out of the gate but have yet to sight it myself.

This code snippet would assign embedding vectors to the nn.Embedding layer.
Note that nn.Embedding will already randomly initialize the weight parameter, but you can of course reassign it.
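For example, here is a minimal sketch of both options (the sizes are just placeholders; nn.Embedding.from_pretrained copies the tensor you pass in, and freeze=False keeps the weights trainable):

    import torch
    import torch.nn as nn

    vocab_size, embedding_length = 10000, 300  # example sizes

    # Option 1: rely on nn.Embedding's built-in random initialization
    # (weights are drawn from a standard normal distribution by default).
    word_embeddings = nn.Embedding(vocab_size, embedding_length)

    # Option 2: build your own random tensor and load it into the layer.
    embedding_vectors = torch.rand(vocab_size, embedding_length)
    word_embeddings = nn.Embedding.from_pretrained(embedding_vectors, freeze=False)

    print(word_embeddings.weight.shape)          # torch.Size([10000, 300])
    print(word_embeddings.weight.requires_grad)  # True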

You could also create the tensor via torch.from_numpy(np.random.rand(...)).float() (from_numpy shares memory with the NumPy array, although the .float() cast to float32 will still create a copy), but your code should work as well.
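
A small sketch of that variant, assuming the same placeholder vocab_size and embedding_length as above:

    import numpy as np
    import torch
    import torch.nn as nn

    vocab_size, embedding_length = 10000, 300  # example sizes

    # Create the random weights in NumPy, wrap them with from_numpy
    # (shares memory with the array), then cast to float32.
    weights = torch.from_numpy(np.random.rand(vocab_size, embedding_length)).float()

    word_embeddings = nn.Embedding(vocab_size, embedding_length)
    word_embeddings.weight = nn.Parameter(weights, requires_grad=True)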
