Embedding raises index out of range if `padding_idx` is -1

Overview

I want to use -1 as the ignore index of an embedding layer, but it does not work.

Environment

Python: 3.7.1
torch: 1.1.0

To Reproduce

import torch
from torch import nn

embedding = nn.Embedding(10, 3, padding_idx=-1)
tokens = torch.tensor([-1, 2]).long()
embedding(tokens)

The last line above raises

RuntimeError: index out of range at ../aten/src/TH/generic/THTensorEvenMoreMath.cpp:193
>>> embedding = nn.Embedding(10, 3, padding_idx=-1)
>>> tokens = torch.tensor([1, 2]).long()   # positive indexes here
>>> embedding(tokens)
tensor([[-0.6424, -1.2020,  0.4287],
        [ 0.5234,  1.2113,  0.8808]], grad_fn=<EmbeddingBackward>)

Thank you for your reply.

I thought padding_idx=-1 meant that index -1 maps to [0., 0., 0.], as in the example at https://pytorch.org/docs/stable/nn.html#embedding, but is that not correct?

Or does padding_idx=-1 mean there is no padding index?

I think a negative padding_idx means negative indexing: padding_idx=-1 refers to the last row of the embedding table (index 9 here), not to the value -1 in the input.
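If I remember correctly, nn.Embedding normalizes a negative padding_idx to num_embeddings + padding_idx, so -1 here would mean row 9. A quick check (a sketch, assuming a recent PyTorch where this normalization is exposed via the padding_idx attribute):

```python
import torch
from torch import nn

# padding_idx=-1 should be normalized to 10 + (-1) = 9,
# and row 9 of the weight matrix is initialized to zeros.
embedding = nn.Embedding(10, 3, padding_idx=-1)

print(embedding.padding_idx)             # normalized index of the padding row
print(embedding(torch.tensor([9])))      # the zero padding vector
```

So the lookup has to use index 9 (or whatever num_embeddings - 1 is), never -1 itself.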


Just set

tokens = torch.tensor([3, 2])

The error comes from the input tensor, not from padding_idx: embedding input indices must be non-negative, so the -1 in your input is out of range. Also, torch.tensor([3, 2]) is int64 (long) by default, so you don't need .long().
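If you still want to keep -1 as an "ignore" marker in your data, one option is to remap those entries to the actual padding row just before the lookup. A minimal sketch (it assumes masked_fill and the normalized embedding.padding_idx attribute, which I believe are available in recent PyTorch):

```python
import torch
from torch import nn

embedding = nn.Embedding(10, 3, padding_idx=-1)  # padding row is index 9

tokens = torch.tensor([-1, 2])
# Input indices must be non-negative, so replace the -1 "ignore"
# markers with the real padding index before the embedding lookup.
tokens = tokens.masked_fill(tokens == -1, embedding.padding_idx)

out = embedding(tokens)
print(out[0])  # zero vector for the padded position
```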
