I have an nn.Embedding layer E. It maps integers to vectors of some dimension. We can think of it as a matrix multiplied by one-hot encoded input vectors, i.e. torch.matmul(E, x). Given a vector y with the output dimension of E, I want to compute a probability distribution over the input space, i.e. softmax(torch.matmul(E.T, y)). How do I do this?
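To make the setup concrete, here is a minimal sketch (with made-up sizes) of the one-hot equivalence: looking up index i in the embedding is the same as multiplying the (transposed) weight matrix by a one-hot vector. Note that PyTorch stores E.weight with shape (num_embeddings, embedding_dim), so the "matrix" in the description above corresponds to E.weight.T.

```python
import torch

# Hypothetical sizes for illustration.
E = torch.nn.Embedding(5, 3)   # 5 inputs, 3-dim embeddings
i = torch.tensor(2)

one_hot = torch.nn.functional.one_hot(i, num_classes=5).float()
# Looking up index i equals multiplying the transposed weight by a one-hot vector.
print(torch.allclose(E(i), torch.matmul(E.weight.T, one_hot)))  # True
```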
Why not just retrieve all the embeddings manually and then multiply by y?
Something like
softmax(torch.matmul(E(torch.arange(E.num_embeddings)), y))?
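A runnable sketch of that manual-retrieval approach, with arbitrary example sizes: fetch every embedding row, project each onto y, and normalize with softmax.

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes for illustration.
num_embeddings, dim = 10, 4
E = torch.nn.Embedding(num_embeddings, dim)
y = torch.randn(dim)

# Retrieve all embedding rows: shape (num_embeddings, dim).
all_rows = E(torch.arange(E.num_embeddings))
# Dot each row with y, then softmax over the input space.
probs = F.softmax(torch.matmul(all_rows, y), dim=0)

print(probs.shape)  # torch.Size([10])
print(probs.sum())  # ~1.0
```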
Would softmax(torch.matmul(E.weight.T, y)) still preserve backprop to the embedding matrix?
Oh yeah, that should work too. One shape caveat: E.weight has shape (num_embeddings, embedding_dim), so for a y of the embedding dimension you want torch.matmul(E.weight, y) without the transpose.
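A quick sketch (example sizes) confirming that backprop reaches the embedding matrix when you use E.weight directly: take a negative log-likelihood of one entry of the distribution and check that E.weight.grad is populated. The target index 3 is arbitrary.

```python
import torch
import torch.nn.functional as F

E = torch.nn.Embedding(10, 4)
y = torch.randn(4)

# E.weight is (10, 4), y is (4,): E.weight @ y gives one score per input.
probs = F.softmax(torch.matmul(E.weight, y), dim=0)
loss = -torch.log(probs[3])  # NLL for an arbitrary target index
loss.backward()

# Gradients flow into the embedding matrix.
print(E.weight.grad is not None)  # True
```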