Given a categorical feature map, for example mat (batch x nclasses x H x W), I’d like to encode it in one-hot format. I know one way is to use scatter_. My question is whether this kind of operation is supported by autograd.

Hi,
It does support autograd, but it can compute gradients only wrt the input tensor and not the indices (the indices are integers, so gradients wrt them do not exist).
This code snippet should make it clear:

import torch
from torch.autograd import Variable
inp = Variable(torch.zeros(10), requires_grad=True)
# You need to set requires_grad=False because scatter does not give gradients wrt the indices
indices = Variable(torch.Tensor([2, 5]).long(), requires_grad=False)
# We need this otherwise we would modify a leaf Variable inplace
inp_clone = inp.clone()
inp_clone.scatter_(0, indices, 1)
inp_clone.sum().backward()
# The values that are not modified by scatter have a gradient of 1
# The values changed by scatter have a gradient of 0, as they were overwritten by the scatter
print(inp.grad)
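
For the one-hot case in the original question, a minimal sketch could look like the following. It assumes a recent PyTorch release (no Variable wrapper) and that the categorical map holds integer class indices of shape batch x H x W; the sizes and the names labels, one_hot and scores are made up for illustration. It shows that the one-hot mask itself carries no gradient, while gradients still flow to a float tensor that is combined with it.

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes, chosen only for illustration
batch, n_classes, height, width = 2, 4, 3, 3

# Integer class indices, shape (batch, H, W)
labels = torch.randint(0, n_classes, (batch, height, width))

# Build the one-hot map by writing 1.0 along the class dimension (dim=1)
one_hot = torch.zeros(batch, n_classes, height, width)
one_hot.scatter_(1, labels.unsqueeze(1), 1.0)

# Cross-check against F.one_hot, which puts the class dimension last
reference = F.one_hot(labels, n_classes).permute(0, 3, 1, 2).float()
print(torch.equal(one_hot, reference))

# The mask was built from integer indices only, so it has no gradient
print(one_hot.requires_grad)

# Gradients still flow to a float tensor that is multiplied by the mask
scores = torch.randn(batch, n_classes, height, width, requires_grad=True)
(scores * one_hot).sum().backward()
print(torch.equal(scores.grad, one_hot))
```

The mask acts as a constant in the graph, so it selects which entries of scores receive a gradient without needing one itself.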

I’m wondering how I can know whether an operation supports autograd or not. For example, torch.dot doesn’t seem to support autograd, but torch.mm does. I am not sure what rule decides whether an operation supports autograd.

Unfortunately, right now (this will change in the near future), a Variable can only contain a Tensor and not directly a number. To get around this, the function will return a Variable containing a Tensor with one element instead of a Variable containing just a number. See below:

import torch
from torch.autograd import Variable
a = torch.rand(10)
print("Operating on Tensor")
print(torch.dot(a, a))
v_a = Variable(a)
print("Operating on Variable")
print(torch.dot(v_a, v_a))

If we operate on a Tensor (without Variable), torch.dot returns a float.
If we operate on a Variable, torch.dot returns a Variable containing a Tensor with one element.

Then I think the official documentation should make this clearer. I like PyTorch a lot, but I think some parts of the official documentation are not quite clear.

Work is currently in progress to make a Variable able to contain either a Tensor or a Python number.
When this is out, this will work as you expect.
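
As a rough sketch of that behaviour, assuming a PyTorch release in which Variable and Tensor have been merged (0.4 or later): torch.dot returns a zero-dimensional tensor that takes part in autograd, and .item() converts it to a plain Python number. The variable names here are only illustrative.

```python
import torch

a = torch.rand(10, requires_grad=True)

# torch.dot returns a zero-dimensional tensor that is tracked by autograd
d = torch.dot(a, a)
print(d.dim(), d.requires_grad)

# d(a . a)/da = 2 * a
d.backward()
print(torch.allclose(a.grad, 2 * a.detach()))

# .item() extracts the value as a plain Python float
print(type(d.item()))
```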

Hello, I ran into a similar problem when using Tensor.scatter_(), and I hope to get some suggestions from you. I use Tensor.scatter_() like this:

cosine = F.linear(F.normalize(input), F.normalize(self.weight))
sine = torch.sqrt(1.0 - torch.pow(cosine, 2))
phi = cosine * self.cos_m - sine * self.sin_m # phi = cos(theta + m)
one_hot = torch.zeros(cosine.size(), device='cuda')
one_hot.scatter_(1, label.view(-1, 1).long(), 1)
output = (one_hot * phi) + ((1.0 - one_hot) * cosine) # can update without this line
# output = phi # can update
# output = cosine # can update

I want to take one value per row from phi and all the other values from cosine. I found that the parameter self.weight can’t be updated: self.weight.grad is not all zero, but self.weight.grad.sum() is zero. self.weight can be updated if I leave out the last line.
I also tried your advice:

inp = Variable(torch.zeros(10), requires_grad=True)
# You need to set requires_grad=False because scatter does not give gradients wrt the indices
indices = Variable(torch.Tensor([2, 5]).long(), requires_grad=False)
# We need this otherwise we would modify a leaf Variable inplace
inp_clone = inp.clone()
inp_clone.scatter_(0, indices, 1)

But this didn’t work either. This problem really confuses me, and I hope to get some advice.

This issue is quite old and a lot has changed since. In particular, Variables have been removed!
Would you have a small code sample that shows the weights not updating?
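
For reference, a minimal sketch of what such a sample could look like, assuming a recent PyTorch release and running on CPU. The sizes, the margin m, the random data, the cross-entropy loss and the clamp before the square root are all assumptions made for illustration and are not taken from the post above. It builds the one-hot mask with scatter_ and then checks whether weight receives a finite, non-zero gradient.

```python
import math
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical margin and sizes, for illustration only
m = 0.5
cos_m, sin_m = math.cos(m), math.sin(m)
inputs = torch.randn(4, 8)                      # 4 samples, 8 features
weight = torch.randn(3, 8, requires_grad=True)  # 3 classes
label = torch.tensor([0, 2, 1, 0])

cosine = F.linear(F.normalize(inputs), F.normalize(weight))
# The clamp is an addition here: it keeps sqrt away from zero or slightly
# negative arguments caused by rounding, where the gradient is inf or NaN
sine = torch.sqrt((1.0 - cosine.pow(2)).clamp(min=1e-12))
phi = cosine * cos_m - sine * sin_m  # phi = cos(theta + m)

# One-hot mask built with scatter_; it is a constant in the graph
one_hot = torch.zeros_like(cosine)
one_hot.scatter_(1, label.view(-1, 1), 1.0)
output = one_hot * phi + (1.0 - one_hot) * cosine

loss = F.cross_entropy(output, label)
loss.backward()

# Inspect the gradient that reaches the weight
print(torch.isfinite(weight.grad).all())
print(weight.grad.abs().sum())
```

If a check like this shows a finite, non-zero gradient but the weights in the real model still do not change, the optimizer setup and the inputs to the square root would be the next things to look at.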