Hey Li-Syuan,
I think the answer you were looking for is "it's random".
I was thinking exactly the same thing as you.
nn.Embedding simply gives you a random tensor for each input id of a word, which can then be updated by your downstream task.
nn.Embedding takes two required arguments, right?
The first one is basically the vocabulary size; let's say you picked 3.
The second one is the embedding dimension, i.e. how many numbers represent each token, which you can set pretty arbitrarily; let's say you picked 5.
Once you set those, then under the hood you can think of nn.Embedding as this:
{
0: [0.123, 0.223, -0.123, -0.123, 0.322], # a completely random 5-dimensional representation of whatever token 0 corresponds to (it's 5-dimensional because you set the embedding dimension to 5)
1: [0.45, 0.123123, 0.123123, 0.123123, 0.123123], # a completely random 5-dimensional representation of whatever token 1 corresponds to
2: [0.656, 0.4564, 0.456456, 0.456456, 0.4564] # a completely random 5-dimensional representation of whatever token 2 corresponds to
}
There are only three entries in the dict because you said vocabulary size is 3.
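If you want to see this for yourself, here's a minimal sketch (the exact numbers will differ on your machine, since they're random):

import torch.nn as nn

emb = nn.Embedding(num_embeddings=3, embedding_dim=5)  # vocab size 3, 5 numbers per token
print(emb.weight)  # a 3x5 matrix of random values, one row per token id (0, 1, 2)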
Then, when you pass in some text, you have some sort of tokenizer that maps the text into indices.
Say your tokenizer does this:
{
"I": 0,
"love": 1,
"cat": 2
}
Then, when you pass the text "I love cat" to your tokenizer, it becomes [0, 1, 2].
This [0, 1, 2], when passed to nn.Embedding, becomes a tensor like this:
[ [0.123, 0.223, -0.123, -0.123, 0.322],
  [0.45, 0.123123, 0.123123, 0.123123, 0.123123],
  [0.656, 0.4564, 0.456456, 0.456456, 0.4564] ]
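In code, that lookup is just this (the tokenizer dict is the toy one from above; real tokenizers are more involved, of course):

import torch
import torch.nn as nn

emb = nn.Embedding(3, 5)
tokenizer = {"I": 0, "love": 1, "cat": 2}  # toy vocab from above
ids = torch.tensor([tokenizer[w] for w in "I love cat".split()])  # tensor([0, 1, 2])
print(emb(ids))        # three random 5-dimensional rows, one per token
print(emb(ids).shape)  # torch.Size([3, 5])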
If you had an arbitrary task, say "is this a grammatically correct sentence?" (1 = yes, 0 = no),
then your model will learn something like this:
I love cat → label is 1
[ [0.123, 0.223, -0.123, -0.123, 0.322],
  [0.45, 0.123123, 0.123123, 0.123123, 0.123123],
  [0.656, 0.4564, 0.456456, 0.456456, 0.4564] ]
→ 1
love I cat → label is 0
[ [0.45, 0.123123, 0.123123, 0.123123, 0.123123],
  [0.123, 0.223, -0.123, -0.123, 0.322],
  [0.656, 0.4564, 0.456456, 0.456456, 0.4564] ]
→ 0
…
and the random representations of each word would be updated accordingly so they are no longer random.
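To make that concrete, here's a rough sketch of a single training step; the mean-pooling plus tiny linear head is just something I made up for illustration, not a standard architecture:

import torch
import torch.nn as nn

emb = nn.Embedding(3, 5)
head = nn.Linear(5, 1)  # made-up toy classifier: pooled embedding -> "grammatical?" logit
opt = torch.optim.SGD(list(emb.parameters()) + list(head.parameters()), lr=0.1)

ids = torch.tensor([0, 1, 2])  # "I love cat"
label = torch.tensor([1.0])    # 1 = grammatically correct
before = emb.weight.detach().clone()

logit = head(emb(ids).mean(dim=0))  # average the three vectors, then predict
loss = nn.functional.binary_cross_entropy_with_logits(logit, label)
loss.backward()
opt.step()

print((emb.weight - before).abs().sum())  # nonzero: the "random" rows have moved

In a real setup you'd loop over many (sentence, label) pairs, but the mechanics are the same.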
What happens if you pass an index larger than 2?
Well, it's out of vocabulary, because you only prepared 3 slots (indices 0, 1, 2) in your nn.Embedding, so you'll get an error.
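For example (the exact error text may vary by PyTorch version, but it's an IndexError):

import torch
import torch.nn as nn

emb = nn.Embedding(3, 5)
emb(torch.tensor([3]))  # IndexError: index out of range in self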
I hope that helped.
The same question bothered me for a very long time too!