I was wondering what kind of embedding is used in the embedding function provided by PyTorch. It’s not clear what is actually happening. For example, is a pre-trained embedding being used to project the word tokens into some hypothetical space? Is there a distance measure being used? Or is it an embedding model like word2vec?
It is just a lookup table from indices to vectors. You can manually initialize them however you want, e.g. to word2vec weights.
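For example (a minimal sketch; the vocabulary size, dimensions, and the random `pretrained` tensor standing in for real word2vec weights are just placeholders):

```python
import torch
import torch.nn as nn

# A toy vocabulary of 5 "words", each mapped to a 3-dimensional vector.
emb = nn.Embedding(num_embeddings=5, embedding_dim=3)

# The forward pass is just row lookup in emb.weight.
idx = torch.tensor([0, 2, 2])
print(torch.equal(emb(idx), emb.weight[idx]))  # True

# To use pre-trained vectors (e.g. word2vec), copy them into the table.
pretrained = torch.randn(5, 3)  # stand-in for real word2vec vectors
emb_w2v = nn.Embedding.from_pretrained(pretrained, freeze=False)
```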
@SimonW Do you know of a way to access an embedding multiple times? I’m building a language model and I’d like to score my output against the word vectors directly to save memory.
To do that I’d need both the input vector and the output vector out of the embedding.
It should be very straightforward to do so: just call embedding(x) multiple times. Any issues with this approach?
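Something like this, for instance (just a sketch; the GRU, the sizes, and the dot-product scoring against the target vectors are assumptions about your setup):

```python
import torch
import torch.nn as nn

emb = nn.Embedding(10, 4)                 # shared lookup table
rnn = nn.GRU(4, 4, batch_first=True)

inputs  = torch.tensor([[1, 2, 3]])       # input token indices
targets = torch.tensor([[2, 3, 4]])       # next-token indices

# The same module can be called any number of times; every call
# reads from (and backpropagates into) the same weight matrix.
in_vecs  = emb(inputs)                    # shape (1, 3, 4)
out_vecs = emb(targets)                   # shape (1, 3, 4)

hidden, _ = rnn(in_vecs)                  # shape (1, 3, 4)
scores = (hidden * out_vecs).sum(dim=-1)  # score outputs against word vectors
```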
I hadn’t realised you could call it multiple times. I’m working from an example that wraps it and feeds it into an RNN, but what you say makes sense. I’ll give it a try, thanks.
Does that mean the values assigned in embeddings are random?
I don’t understand what you mean. Embedding vectors are typically initialized to random values and then trained.
In word2vec, they are not random; the vector values are trained. So are you saying that in nn.Embedding, those vectors are just random values?
Even in word2vec, they are initialized to random vectors. Initializing to random values and training are not mutually exclusive; they almost always go together.
I still don’t understand what you mean by “they are random values”. The forward pass is just a lookup into the weight matrix, which is typically initialized to random values. Of course, you can also change it to other values.
How do randomly initialized vectors become meaningful vector representations? How are the embeddings calculated?
Backpropagation updates the vectors toward representations that are more useful for the task. That’s also how the vectors are learned in word2vec. Instead of using the pre-trained word2vec vectors, you can let your own network update the vectors during training.
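A toy illustration of that (hypothetical vocabulary size, dimensions, and task head; the point is just that a gradient step moves the looked-up rows):

```python
import torch
import torch.nn as nn

emb = nn.Embedding(100, 16)               # 100-word vocab, 16-dim vectors
head = nn.Linear(16, 2)                   # some downstream task head
opt = torch.optim.SGD(list(emb.parameters()) + list(head.parameters()), lr=0.1)

tokens = torch.randint(0, 100, (8,))      # a fake batch of word indices
labels = torch.randint(0, 2, (8,))        # fake task labels

before = emb.weight[tokens[0]].clone()

loss = nn.functional.cross_entropy(head(emb(tokens)), labels)
loss.backward()
opt.step()                                # the gradient step updates the embedding rows

print(torch.equal(before, emb.weight[tokens[0]]))  # False: the vector moved
```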