Differences between nn.Embedding and nn.Parameter?

Hi there,

What’s the difference between nn.Embedding and nn.Parameter when I only need to access the word embeddings by indices?

Suppose:

  1. the word indices are W = torch.LongTensor([1, 2, 3, 4, 5, 0])
  2. E = nn.Embedding(10, 2)
  3. P = torch.empty(10, 2).uniform_(-0.1, 0.1).requires_grad_()

If I want to access the embeddings of the corresponding words, what’s the difference between the following two ways:

E(W)

and

P[W, :]

?

Any efficiency differences?
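For concreteness, here is a minimal runnable sketch of the two lookups. It only checks shapes and gradient flow; the actual values differ because the two tables are initialized independently:

    import torch
    import torch.nn as nn

    W = torch.LongTensor([1, 2, 3, 4, 5, 0])

    E = nn.Embedding(10, 2)
    P = torch.empty(10, 2).uniform_(-0.1, 0.1).requires_grad_()

    out_e = E(W)     # shape (6, 2): rows of E.weight selected by index
    out_p = P[W, :]  # shape (6, 2): the same indexing done by hand

    # Both paths are differentiable; gradients accumulate only in
    # the rows that were actually looked up.
    out_e.sum().backward()
    out_p.sum().backward()
    print(E.weight.grad[6])  # zeros: row 6 was never indexed
    print(P.grad[6])         # zeros as well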


I would also like to know the difference.

Here is a nice explanation:

They are actually all the same underneath: just a trainable matrix (nn.Linear comes with an extra bias tensor). However, each has a wrapper that makes it behave differently when you give it an input:

  1. nn.Embedding selects rows of its weight matrix, given a tensor of integer indices.
  2. nn.Linear does a matrix multiply with its weight, plus the bias; with PyTorch’s (out_features, in_features) weight layout, that is the einsum ...d, e d -> ...e.
  3. nn.Parameter just makes a tensor trainable (it receives gradients and is updated on each optimizer step). This is the lowest level you can go, so you can define your entire deep neural network with nothing but nn.Parameters and do all of the above manually with indexing, gathers, and matrix multiplies or einsums, as in the sketch below.
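To make that concrete, here is a minimal sketch (class and attribute names are my own, not from any library) that rebuilds the nn.Embedding lookup and the nn.Linear matmul from bare nn.Parameters:

    import torch
    import torch.nn as nn

    class ManualEmbedLinear(nn.Module):
        # Reimplements nn.Embedding followed by nn.Linear using
        # only nn.Parameter tensors.
        def __init__(self, num_embeddings, embed_dim, out_dim):
            super().__init__()
            self.emb = nn.Parameter(
                torch.empty(num_embeddings, embed_dim).uniform_(-0.1, 0.1))
            # nn.Linear stores its weight as (out_features, in_features)
            self.weight = nn.Parameter(
                torch.empty(out_dim, embed_dim).uniform_(-0.1, 0.1))
            self.bias = nn.Parameter(torch.zeros(out_dim))

        def forward(self, indices):
            x = self.emb[indices]  # what nn.Embedding does: row selection
            # what nn.Linear does: einsum with the weight, plus bias
            return torch.einsum('...d,ed->...e', x, self.weight) + self.bias

    m = ManualEmbedLinear(10, 2, 4)
    W = torch.LongTensor([1, 2, 3, 4, 5, 0])
    print(m(W).shape)  # torch.Size([6, 4])

Because the tensors are registered as nn.Parameters, they appear in m.parameters() and get updated by any optimizer, exactly like the weights inside the built-in modules.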