EmbeddingBag of vocab size

Why does EmbeddingBag expect the vocabulary length as num_embeddings?

num_embeddings - size of the dictionary of embeddings

https://pytorch.org/docs/stable/nn.html?highlight=embeddingbag#torch.nn.EmbeddingBag

I’m following the example here https://pytorch.org/tutorials/beginner/text_sentiment_ngrams_tutorial.html

maybe it creates a representation for each word, because a computer does not understand words, it understands numbers, so when we do,

import torch
import torch.nn as nn
x = nn.EmbeddingBag(num_embeddings=10, embedding_dim=3, mode='sum')
list(x.parameters())

it gives,

[Parameter containing:
 tensor([[ 0.8823, -0.3787,  0.8360],
         [-1.4388, -0.6124, -1.6967],
         [ 0.4632,  0.6406,  0.1272],
         [-0.8657, -2.0807, -0.9140],
         [-0.3749, -0.5471, -0.5424],
         [ 0.9730,  0.5713,  0.4584],
         [-1.3402,  0.1033, -1.4363],
         [-0.1600, -0.3686, -0.2954],
         [ 1.1288, -0.1282, -1.0070],
         [ 0.8220, -0.0371, -0.7206]], requires_grad=True)]

this means that our vocabulary has 10 words, and each of those words is represented by an array of 3 floating point numbers; so, for a computer, a word would mean these 3 floating point numbers.
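as a quick check on the layer defined above, the parameter table is exactly vocab size by embedding dimension:

x.weight.shape
# torch.Size([10, 3]), i.e. 10 words, each represented by 3 floats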

so if our vocabulary had 10 words like

apple orange banana grape juice fruit pineapple strawberry mango watermelon

then for computer,

apple

would mean

[ 0.8823, -0.3787,  0.8360]

and

orange

would mean

[-1.4388, -0.6124, -1.6967]
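a minimal sketch of that idea, reusing x from above (the word-to-index mapping below is my own assumption, just for illustration):

words = "apple orange banana grape juice fruit pineapple strawberry mango watermelon".split()
vocab = {word: index for index, word in enumerate(words)}

apple_vec = x.weight[vocab["apple"]]    # row 0, e.g. [ 0.8823, -0.3787,  0.8360]
orange_vec = x.weight[vocab["orange"]]  # row 1, e.g. [-1.4388, -0.6124, -1.6967]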

if we want to change the representation of any of these words, we update its embedding.
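for instance, a row can be overwritten by hand (the replacement values here are made up; normally the rows are updated by backprop during training, not manually):

with torch.no_grad():
    x.weight[vocab["banana"]] = torch.tensor([1.0, 0.0, -1.0])  # hypothetical new vector for "banana"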

plus, EmbeddingBag gives us the sum (or we could even take the mean) of these embeddings, that is, if we do,

x(torch.LongTensor([[0, 1]]))

then we will get,

tensor([[-0.5565, -0.9911, -0.8608]], grad_fn=<EmbeddingBagBackward>)

that is, the sum of the first two rows (the embeddings for indices 0 and 1)
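we can verify that directly (assuming x from above):

first_two = x.weight[0] + x.weight[1]     # embeddings of indices 0 and 1, added by hand
bag_out = x(torch.LongTensor([[0, 1]]))   # same thing via EmbeddingBag
torch.allclose(first_two, bag_out[0])     # True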

Yes, I understand it now, thank you. “EmbeddingBag” threw me off, but I see it is similar to having an Embedding layer (so it is vocab-sized, to index into), except that we can perform an aggregation over it.
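To convince myself, here is a small sketch of that equivalence (fresh layers, names chosen just for illustration): an nn.Embedding sharing the same weight table, followed by a sum over the bag dimension, matches EmbeddingBag with mode='sum'.

import torch
import torch.nn as nn

bag = nn.EmbeddingBag(num_embeddings=10, embedding_dim=3, mode='sum')
emb = nn.Embedding(num_embeddings=10, embedding_dim=3)
emb.weight = bag.weight                  # share the same 10 x 3 lookup table

idx = torch.LongTensor([[0, 1]])         # one bag containing indices 0 and 1
out_bag = bag(idx)                       # EmbeddingBag sums within the bag
out_emb = emb(idx).sum(dim=1)            # Embedding lookup, then explicit sum
torch.allclose(out_bag, out_emb)         # True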