What is the exact implementation of torch.embedding?

I am currently working on class Embedding() in PyTorch and I looked at its implementation.

In the forward() method, it calls the F.embedding() method: https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/sparse.py#L124.

Then I found that F.embedding() method finally calls the torch.embedding() method: https://github.com/pytorch/pytorch/blob/master/torch/nn/functional.py#L1814.

However, I could not find where torch.embedding() itself is implemented.

It should eventually call into this method for the forward pass.
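To make the call chain concrete, here is a minimal sketch (my own example, not taken from the PyTorch source) that checks that the module, the functional wrapper, the low-level torch.embedding() call, index_select, and plain indexing all return the same values in the forward pass. The sizes and indices are arbitrary, and it assumes the default options (no padding_idx, no max_norm).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Arbitrary sizes chosen for illustration only.
emb = nn.Embedding(num_embeddings=10, embedding_dim=4)
idx = torch.tensor([[1, 2], [4, 9]])

# Three levels of the same call chain.
out_module = emb(idx)                          # nn.Embedding.forward
out_functional = F.embedding(idx, emb.weight)  # torch.nn.functional.embedding
out_torch = torch.embedding(emb.weight, idx)   # low-level op (note: weight comes first)

# The same lookup written as a row gather on the weight matrix.
out_index_select = emb.weight.index_select(0, idx.reshape(-1)).reshape(*idx.shape, -1)
out_indexing = emb.weight[idx]

print(out_module.shape)
print(torch.equal(out_module, out_index_select))
```

Note that the argument order flips between F.embedding(input, weight) and torch.embedding(weight, indices).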


I’m trying to understand the code here.
Can you help me understand why Embedding needs to use index_select rather than simple [...] indexing?
I also wonder why it needs its own backward method. Can’t autograd figure out the indexing?

The [...] indexing is Python syntax and won’t work in C++; see Libtorch Indexing.

The backward methods might have been added for specific checks (such as sparsity) and are hooked into Autograd.

I was wondering whether index_select would give me a sparse gradient if I used it instead of […] in Python. But this doesn’t seem to work?
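A quick way to check this is the sketch below (my own test, with arbitrary shapes). It runs the same lookup through index_select and through [...] indexing and inspects the resulting gradients. As far as I can tell, both produce an ordinary dense gradient with the same values.

```python
import torch

torch.manual_seed(0)

# Two independent copies of the same weights, so the gradients do not mix.
weight_a = torch.randn(10, 4, requires_grad=True)
weight_b = weight_a.detach().clone().requires_grad_(True)
idx = torch.tensor([1, 1, 3])  # row 1 is looked up twice

weight_a.index_select(0, idx).sum().backward()
weight_b[idx].sum().backward()

# Both gradients are regular strided (dense) tensors of the full weight shape.
print(weight_a.grad.is_sparse, weight_b.grad.is_sparse)
print(torch.allclose(weight_a.grad, weight_b.grad))
```

So swapping [...] for index_select does not change the gradient layout; only the rows that were looked up are nonzero, but the tensor itself is still dense.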

I think you are saying that autograd is able to make sparse gradients itself, but that the Embedding class added a custom backward method here because they wanted extra checks?

Is there an indexing function that provides sparse gradients? Or should I implement indexing by means of a sparse matrix multiplication in order to achieve this?
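One option that may be worth trying before writing a sparse matrix multiplication: F.embedding (and nn.Embedding) accept a sparse=True flag, which makes the gradient with respect to the weight a sparse tensor. The sketch below is only an illustration with arbitrary shapes; whether it fits depends on the optimizer, since, per the nn.Embedding docs, only some optimizers handle sparse gradients.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

weight = torch.randn(10, 4, requires_grad=True)
idx = torch.tensor([1, 1, 3])  # row 1 is looked up twice

# sparse=True asks for a sparse gradient w.r.t. weight.
out = F.embedding(idx, weight, sparse=True)
out.sum().backward()

print(weight.grad.is_sparse)

# Converting to dense sums any duplicate entries, which makes it easy to inspect.
dense_grad = weight.grad.to_dense()
print(dense_grad[1], dense_grad[3])
```

Here the forward pass is the same row lookup as weight[idx]; only the layout of weight.grad differs.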