nn.Embedding: meaning of the input indices

I'm new to PyTorch and am learning about the nn.Embedding module.

The docs say "the input to the module is a list of indices, and the output is the corresponding word embeddings."

I don't understand what they mean by "corresponding". Is it some kind of mean value? Or a variance?

I experimented with it but couldn't figure it out:

import torch
import torch.nn as nn

embedding = nn.Embedding(10, 3)  # an Embedding module containing 10 tensors of size 3
input = torch.LongTensor([[1, 2, 4, 5], [4, 3, 2, 9]])  # a batch of 2 samples of 4 indices each
wrd_embedding = embedding(input)
print(wrd_embedding)

tensor([[[ 0.7576,  0.7259,  0.0674],
         [-0.0827,  1.8416, -0.4799],
         [-0.2899,  1.4135, -0.0972],
         [ 0.4071, -1.5048,  1.9368]],

        [[-0.2899,  1.4135, -0.0972],
         [-0.6687,  0.5834,  0.0072],
         [-0.0827,  1.8416, -0.4799],
         [-0.4928,  0.5937, -0.1569]]], grad_fn=<EmbeddingBackward>)
print(torch.var(wrd_embedding[:, 0], 1))
print(torch.sum(wrd_embedding[:, 0], 1))

tensor([0.1518, 0.8701], grad_fn=<VarBackward1>)
tensor([1.5509, 1.0264], grad_fn=<SumBackward2>)

My question: what does the input value [[1, 2, 4, 5], [4, 3, 2, 9]] actually do?

Each item of the input, such as 1, is replaced by its embedding vector. The index 1 selects row 1 (zero-indexed) of the embedding layer's weight matrix, so the output at that position is exactly that row. The repeated indices 2 and 4 in your two samples are why the same rows appear twice in your output.
You can inspect the embedding layer's weight matrix with embedding.weight.
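A quick way to convince yourself of this (a minimal sketch; the seed is arbitrary and only makes the run reproducible):

import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
embedding = nn.Embedding(10, 3)  # weight: a learnable (10, 3) lookup table
input = torch.LongTensor([[1, 2, 4, 5], [4, 3, 2, 9]])

out = embedding(input)          # shape (2, 4, 3): one weight row per index
same = embedding.weight[input]  # plain tensor indexing returns the same values
print(torch.equal(out, same))   # True: the module is just a row lookup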
Search for word2vec to learn more about where such vectors come from.
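If you eventually want to start from pretrained word vectors instead of random ones, nn.Embedding.from_pretrained builds the same lookup table from an existing matrix (a sketch; the random matrix here is a hypothetical stand-in for real word2vec/GloVe weights):

import torch
import torch.nn as nn

# Hypothetical stand-in for real pretrained vectors:
# 10 vocabulary entries, 3 dimensions each.
pretrained = torch.randn(10, 3)

embedding = nn.Embedding.from_pretrained(pretrained, freeze=True)  # rows stay fixed
print(torch.equal(embedding(torch.LongTensor([1]))[0], pretrained[1]))  # True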
