Problem with padding_idx in torch.nn.functional.embedding

There may be a problem with torch.nn.functional.embedding:

import torch

print(torch.nn.functional.embedding(torch.LongTensor([[0, 2, 0, 5]]), torch.rand(10, 3), padding_idx=0))

The output embedding vectors are not zero where the corresponding index is 0.
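As far as I can tell, padding_idx in the functional API mainly excludes that row from gradient updates; it does not zero the stored vector for you. A minimal sketch of a workaround, assuming you want actual zero vectors for index 0, is to zero that row of the weight matrix yourself before the lookup:

```python
import torch

torch.manual_seed(0)
weights = torch.rand(10, 3)

# padding_idx does not zero the looked-up vectors by itself; it keeps the
# row at that index from being updated by gradients. Zero the row
# explicitly to get zero padding vectors in the output.
weights[0].zero_()

out = torch.nn.functional.embedding(
    torch.LongTensor([[0, 2, 0, 5]]), weights, padding_idx=0
)
print(out)  # rows looked up with index 0 are now all zeros
```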


I can confirm that, and it also happens with the nn.Embedding version:

predefined_weights = torch.rand(10, 3)

emb = torch.nn.Embedding(num_embeddings=10, embedding_dim=3, padding_idx=0, _weight=predefined_weights)


It “pads” with the vector at idx=0 from predefined_weights instead of a zero vector.
From the code, it looks like reset_parameters zeros out the vector at padding_idx, but only
if _weight is None;
otherwise, it uses the given _weight parameter as-is.
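If a zero padding vector is still wanted together with predefined weights, a small sketch of a workaround (my assumption, not an official API) is to zero the padding row in-place after construction:

```python
import torch

predefined_weights = torch.rand(10, 3)

emb = torch.nn.Embedding(
    num_embeddings=10, embedding_dim=3, padding_idx=0, _weight=predefined_weights
)

# reset_parameters() is skipped when _weight is given, so zero the padding
# row explicitly if a zero padding vector is desired.
with torch.no_grad():
    emb.weight[emb.padding_idx].zero_()

print(emb(torch.LongTensor([[0, 2, 0, 5]])))  # index-0 rows are zeros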

IMO, the behavior is fine and the documentation should be adjusted, because if you provide your own weights, you should get full control over them, including the padding vector.


Thanks a lot.
I agree that the predefined weights give better control over the behavior.
But the relevant definition of the _weight argument should be added to the official documentation anyway.


Would you mind creating an issue on GitHub with this suggestion?
Also, would you or @RoySadaka be interested in providing the PR for it? 🙂

Sure, I'll do it, thank you!
