Both nn.Linear and nn.Embedding will give you, in your example, a 3-dim vector. That's the whole point, i.e., to convert a word into an ideally meaningful vector (i.e., a numeric, fixed-size representation of a word). The difference is w.r.t. the input:

- nn.Linear expects a one-hot vector of the size of the vocabulary, with a single 1 at the index representing the specific word.
- nn.Embedding just expects this index (and not a whole vector).
However, if both nn.Linear and nn.Embedding were initialized with the same weights (and the nn.Linear layer had no bias term), their outputs would be exactly the same.
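For illustration, here is a minimal sketch of this equivalence; the sizes (vocabulary of 10, 3-dim vectors) and the word index 4 are just made up for this example:

import torch
import torch.nn as nn

vocab_size, embedding_dim = 10, 3

embedding = nn.Embedding(vocab_size, embedding_dim)
linear = nn.Linear(vocab_size, embedding_dim, bias=False)

# Copy the embedding weights into the linear layer; nn.Linear stores
# its weight as (out_features, in_features), i.e., the transpose of
# the (vocab_size, embedding_dim) embedding matrix.
with torch.no_grad():
    linear.weight.copy_(embedding.weight.T)

word_index = torch.tensor([4])  # nn.Embedding takes just the index ...
one_hot = nn.functional.one_hot(word_index, vocab_size).float()  # ... nn.Linear the one-hot vector

print(torch.allclose(embedding(word_index), linear(one_hot)))  # True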
Yes, by default, the weights of both layers will be modified during the training process. In this respect, they are like any other layers in your network. However, you can tell the network not to modify the weights of a specific layer; I think it would look something like this:
import torch.nn as nn

embedding = nn.Embedding(10, 3)
embedding.weight.requires_grad = False  # freeze this layer's weights
This makes sense if you use pretrained word embeddings such as Word2Vec or GloVe. If you initialize your weights randomly, you certainly want them to be modified during training.
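As a side note, PyTorch also offers nn.Embedding.from_pretrained for exactly this case: it loads a given weight matrix and freezes it by default. A minimal sketch, with a random matrix standing in for real pretrained vectors:

import torch
import torch.nn as nn

pretrained = torch.randn(10, 3)  # stand-in for real Word2Vec/GloVe vectors

# freeze=True is the default, so the weights stay fixed during training
embedding = nn.Embedding.from_pretrained(pretrained, freeze=True)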