a = torch.Tensor(10)
a = Variable(a, requires_grad=True)
embedding = nn.Embedding(10, 100)
output = embedding(a)
AssertionError Traceback (most recent call last)
<ipython-input-23-11b80614bb73> in <module>()
2 a = Variable(a, requires_grad=True)
3 embedding = nn.Embedding(10, 100)
----> 4 output = embedding(a)
/home/zeng/code/tensorfold/lib/python2.7/site-packages/torch/nn/modules/module.pyc in __call__(self, *input, **kwargs)
201 def __call__(self, *input, **kwargs):
--> 202 result = self.forward(*input, **kwargs)
203 for hook in self._forward_hooks.values():
204 hook_result = hook(self, input, result)
/home/zeng/code/tensorfold/lib/python2.7/site-packages/torch/nn/modules/sparse.pyc in forward(self, input)
92 padding_idx, self.max_norm, self.norm_type,
93 self.scale_grad_by_freq, self.sparse
---> 94 )(input, self.weight)
96 def __repr__(self):
/home/zeng/code/tensorfold/lib/python2.7/site-packages/torch/nn/_functions/thnn/sparse.pyc in forward(self, indices, weight)
42 def forward(self, indices, weight):
43 assert indices.dim() <= 2
---> 44 assert not self.needs_input_grad, "Embedding doesn't " \
45 "compute the gradient w.r.t. the indices"
AssertionError: Embedding doesn't compute the gradient w.r.t. the indices
What you give as input to an embedding layer is the index of the element you want to embed, and it returns the embedding for this element.
This operation is differentiable w.r.t. the content of the embedding but not the index, so you cannot give it an index for which you want the gradients (in your code snippet, you set requires_grad=True on the indices).
If you are just looking for a linear transformation of the input tensor a, you want to use the nn.Linear layer, not nn.Embedding.
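To make the distinction concrete, here is a minimal sketch of the intended usage: the embedding layer expects integer indices (a LongTensor), not a float tensor with requires_grad=True. This uses the current PyTorch API, where Variable has been merged into Tensor, rather than the older API shown in the question.

```python
import torch
import torch.nn as nn

# 10 embedding vectors, each of dimension 100.
embedding = nn.Embedding(10, 100)

# Indices into the embedding table: integers, no gradients needed.
indices = torch.tensor([0, 3, 7])

# Looks up rows 0, 3, and 7 of the weight table; shape (3, 100).
output = embedding(indices)
print(output.shape)
```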
Then, how to get some of the embeddings updated and some fixed?
If you want to achieve that, I think you will need to set the gradients corresponding to these embeddings to 0 manually after performing the backward pass.
I am having exactly the same problem. I do understand why the problem occurs (surely it makes no sense to compute the gradient w.r.t. the indices).
But I do not understand how to adjust the code so that it computes the gradients w.r.t. the content values of the embedding vectors.
Could someone explain this?
When an autograd function receives an input with requires_grad=True, that means it will need to compute the gradients for this input. If the function is unable to compute these gradients (Embedding w.r.t. the indices in this case), it will raise an error when you pass indices with requires_grad=True.
To fix this error, you just need to pass indices with requires_grad=False, either by computing them using only Variables that do not require gradients, or, if you learn your indices in another way, by using new_ind = ind.detach(). By doing so, new_ind will have requires_grad=False and no gradients will be propagated through it.
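A short sketch of the detach step, again using the current Tensor API instead of Variable:

```python
import torch

# A tensor that is part of the autograd graph.
ind = torch.tensor([1.0, 4.0], requires_grad=True)

# detach() returns a view that is cut off from the graph:
# no gradients will flow back through new_ind.
new_ind = ind.detach()

# Cast to integer indices suitable for an embedding lookup.
idx = new_ind.long()
```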
The Embedding layer internally contains the embedding vectors (called weight), created as an nn.Parameter. A Parameter can be considered here as a simple Variable with requires_grad=True, so the gradients w.r.t. the embedding vectors will always be computed.
Indeed, you can check that even if the input (indices) of your embedding layer has requires_grad=False, the output will have requires_grad=True (because gradients are needed for the embedding vectors).
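You can verify this propagation directly; the output requires grad because embedding.weight (a Parameter) does, even though the index input does not:

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(10, 3)
indices = torch.tensor([1, 4])   # integer indices, requires_grad is False

out = embedding(indices)

# The weight Parameter requires grad, so the lookup result does too.
print(indices.requires_grad, out.requires_grad)  # False True
```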
Thank you very much! After some adjustments, my code now works.
@Janinau can you specify what your adjustments were?