Hi, I've come across a problem in PyTorch with embeddings for NLP.
Suppose I have N sentences of different lengths, and I set max_len
to the maximum length among the sentences, so the shorter sentences need to be padded with zero vectors. I define the Embedding
as below, with one extra zero vector at index vocab_size:
emb = nn.Embedding(vocab_size + 1, emb_num, padding_idx=vocab_size)
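
For concreteness, here is a minimal runnable sketch of that setup; vocab_size = 5, emb_num = 3, and the index values are toy numbers I made up just for illustration:

import torch
import torch.nn as nn

vocab_size, emb_num, max_len = 5, 3, 4   # toy values, just for illustration
emb = nn.Embedding(vocab_size + 1, emb_num, padding_idx=vocab_size)
# two sentences of lengths 4 and 2, padded to max_len with index vocab_size
batch = torch.tensor([[1, 2, 3, 4],
                      [2, 0, vocab_size, vocab_size]])
out = emb(batch)   # shape: (2, max_len, emb_num); padded positions embed to zero vectors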
But when I use my pre-trained word vectors, I need to manually construct a weight matrix that includes the zero vector:
import numpy as np

# torch.from_numpy() takes a NumPy array, not a file path, so the vectors have to be
# loaded into an array first (here 'myw2v.model' holds an array of shape (vocab_size, emb_num))
myw2v = torch.from_numpy(np.load('myw2v.model')).float()
zeros = torch.zeros(1, emb_num)
myw2v = torch.cat((myw2v, zeros), dim=0)  # concatenate along dim 0, so the zero row lands at index vocab_size
emb.weight.data.copy_(myw2v)              # myw2v is already a tensor, no second from_numpy needed
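
As a sanity check (a minimal sketch; it relies on nn.Embedding zeroing the gradient at padding_idx, so the row stays zero during training):

assert torch.all(emb.weight.data[vocab_size] == 0)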
So I'd like to know: is there a function in PyTorch that can handle the situation above directly?
Thanks.