The input you give to an embedding layer is the index of the element you want to embed, and it returns the embedding for that element.
This operation is differentiable wrt the content of the embedding but not the index, so you cannot give it an index for which you want gradients (in your code snippet, you set requires_grad=True for a).
If you are just looking for a linear transformation of the input tensor a, you want to use the nn.Linear layer, not nn.Embedding.
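For instance, here is a minimal sketch contrasting the two layers (written against the current PyTorch API, where Variable has been merged into Tensor; the sizes are made up):

```python
import torch
import torch.nn as nn

# nn.Embedding: input is an integer tensor of indices, which must not
# require gradients; the output is differentiable wrt emb.weight only.
emb = nn.Embedding(num_embeddings=10, embedding_dim=4)
idx = torch.tensor([1, 5, 3])        # LongTensor of indices
out = emb(idx)                       # shape (3, 4)

# nn.Linear: input is a float tensor, and the operation is differentiable
# wrt both the input and the layer's weights.
a = torch.randn(2, 4, requires_grad=True)
lin = nn.Linear(in_features=4, out_features=3)
out2 = lin(a)                        # shape (2, 3)
```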
If you want to achieve that, I think you will need to set the gradients corresponding to these embeddings to 0 manually after performing the backward pass.
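Assuming "that" here means keeping a chosen subset of embedding rows fixed (my reading of the question), a minimal sketch could look like this:

```python
import torch
import torch.nn as nn

emb = nn.Embedding(10, 4)
idx = torch.tensor([1, 5, 7])
loss = emb(idx).sum()
loss.backward()

# Hypothetical choice: rows 0-4 should stay fixed. Zero their gradients
# after backward and before the optimizer step so they are never updated.
frozen_rows = torch.arange(5)
with torch.no_grad():
    emb.weight.grad[frozen_rows] = 0
```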
I am having exactly the same problem. I do understand why the problem occurs (surely it makes no sense to compute the gradient w.r.t. the indices).
But I do not understand how to adjust the code so that it computes the gradients w.r.t. the content values of the embedding vectors.
When an autograd function receives an input with requires_grad=True, that means it will need to compute the gradients for this input.
If the function is unable to compute these gradients (as Embedding is wrt the indices in this case), it raises an error when you pass indices with requires_grad=True.
To fix this error, you just need to pass indices with requires_grad=False: either compute them using only Variables that do not require gradients, or, if you learn your indices in some other way, use new_ind = ind.detach(). The detached new_ind will have requires_grad=False, and no gradients will be propagated through it.
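A minimal sketch of the detach fix (the name ind is just a placeholder for indices produced elsewhere in your pipeline):

```python
import torch
import torch.nn as nn

emb = nn.Embedding(10, 4)

# ind stands in for indices computed elsewhere.
ind = torch.tensor([1, 2, 3])

new_ind = ind.detach()     # requires_grad is False
out = emb(new_ind)         # gradients flow only to emb.weight
out.sum().backward()       # populates emb.weight.grad, not new_ind.grad
```

Note that in current PyTorch an integer index tensor cannot require gradients in the first place, so the explicit detach mainly matters for the older Variable API this thread was written against.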
The Embedding layer internally stores the embedding vectors (its weight), which are created as an nn.Parameter in its constructor. A Parameter can be considered here as a simple Variable with requires_grad=True, so the gradients wrt the embedding vectors will always be computed.
Indeed, you can check that even if the input (indices) of your embedding layer has requires_grad=False, the output will have requires_grad=True (because gradients are needed for the embedding vectors).
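A quick check (a minimal sketch):

```python
import torch
import torch.nn as nn

emb = nn.Embedding(10, 4)
idx = torch.tensor([1, 5, 3])

print(idx.requires_grad)         # False: indices carry no gradient
print(emb.weight.requires_grad)  # True: the weight is an nn.Parameter
print(emb(idx).requires_grad)    # True: gradients are needed for the weights
```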