Problem about tensor and Variable

Hi, I have a problem as follows:

v_a = Variable(Tensor([1, 2, 3]), requires_grad=True)
t_a = v_a.data
t_b = t_a * 2
v_b = Variable(t_b, requires_grad=True)
loss = Crit(v_b, target)
loss.backward()

Please let me know: how can I get the grad of v_a?
I tried to run code like this, but failed to calculate the grad of v_a.
Note: t_b = t_a * 2 is just an example; the real operation is complex and can’t be achieved with Variable operations.
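A minimal sketch of what goes wrong here (with the unknown Crit loss replaced by a simple sum, just for illustration): pulling out .data leaves the autograd graph, so v_b is a brand-new leaf with no connection back to v_a.

```python
import torch
from torch.autograd import Variable

v_a = Variable(torch.Tensor([1., 2., 3.]), requires_grad=True)
t_b = v_a.data * 2                        # .data steps outside the autograd graph
v_b = Variable(t_b, requires_grad=True)   # a brand-new leaf, unrelated to v_a

loss = v_b.sum()
loss.backward()

print(v_b.grad)   # gradient reaches v_b
print(v_a.grad)   # None: the graph was cut at .data
```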

The following code runs ok for me:

import torch
from torch import Tensor
from torch.autograd import Variable
from torch.nn import functional as F

target = Variable(torch.LongTensor([2, 1]))

v_a = Variable(Tensor([[1, 2, 3], [3, 4, 5]]), requires_grad=True)
v_b = v_a * 2
loss = F.cross_entropy(v_b, target)
loss.backward()

print('v_a.grad.data', v_a.grad.data)

A few points:

  • Crit doesn’t exist, as far as I can tell; I replaced it with cross_entropy
  • torch.nn contains functors, but you can find functional equivalents in torch.nn.functional, which I’ve used here
  • you need to provide everything in mini-batches, so:
    – the input to cross_entropy should be two-dimensional, (N, C), where N is the size of the minibatch and C is the number of classes
    – in practice, this meant I had to make your v_a two-dimensional too, so it was a minibatch
  • target, for cross_entropy, is a one-dimensional LongTensor of class labels, one per minibatch example
  • everything should be Variables. Forget that Tensors exist :slight_smile:

I feel like that statement can be misleading for people learning. Variables are wrappers for tensors: think of a Variable as a tensor wrapped with PyTorch’s Variable class so that gradients can be computed automatically :slight_smile:

Yes… the flaw with that is: let’s say you feed a Tensor into a net. You don’t need the gradient for that tensor (no backprop to the input is required), so logically it shouldn’t need to be a Variable. Except it does.

Then you just set requires_grad=False. Using a tensor should be thought of as the same as using Variable(tensor).
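To illustrate the point above, a small sketch (the nn.Linear net is just a stand-in): an input wrapped with requires_grad=False still flows through the network, the parameters get gradients, and no gradient is backpropagated into the input.

```python
import torch
from torch import nn
from torch.autograd import Variable

net = nn.Linear(3, 2)
x = Variable(torch.randn(4, 3), requires_grad=False)  # input: no gradient needed

out = net(x).sum()
out.backward()

print(net.weight.grad is not None)  # True: parameters receive gradients
print(x.grad)                       # None: no backprop into the input
```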

Thanks for your reply, but you ignored this line in my code: “t_b = 2 * (v_a.data)”.
Now I’m sure that this way I can’t get v_a.grad,
so I want to know how to achieve my operation without leaving Variables. It’s similar to scatter_nd in TensorFlow:

indices = tf.constant([[0], [3]])
updates = tf.constant([0.2, 0.6])
scatter = tf.scatter_nd(indices, updates, shape=[4])
print(scatter)
# [0.2, 0, 0, 0.6]
As you can see, each index in indices fills in the corresponding value from updates.
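For reference, one possible way to express the same scatter in PyTorch while staying inside autograd (a sketch, assuming the out-of-place index_add is available in your version): build the result from a zeros Variable rather than writing into .data, so gradients flow back to the updates.

```python
import torch
from torch.autograd import Variable

indices = torch.LongTensor([0, 3])
updates = Variable(torch.Tensor([0.2, 0.6]), requires_grad=True)

# Scatter the updates into a length-4 vector without leaving the graph
scatter = Variable(torch.zeros(4)).index_add(0, indices, updates)
print(scatter)        # [0.2, 0, 0, 0.6]

scatter.sum().backward()
print(updates.grad)   # [1, 1]: gradients flow back through the scatter
```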

Ah, then I agree with you. Unless someone else has an idea. But your title and the code are not helping you here: ‘Problem’ is pretty vague.

Recommend changing title to ‘How to obtain gradient from non-user Variables?’

Just to bump this to the top, note that I have not answered this question. The question is:

"How to obtain the gradient of a Variable that is not user-created?"

I would like to know too :slight_smile:

Hi,

You can’t access it with .grad.
You can use var.register_hook(fn) (doc here), where fn is a function that will be given the gradient of var as input. You can then use this function to monitor the gradient, or store it in a global variable so it’s available somewhere else in your program.
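A minimal sketch of this hook pattern (the grads dict and save_grad helper are just one way to stash the values):

```python
import torch
from torch.autograd import Variable

grads = {}  # store intermediate gradients by name

def save_grad(name):
    def hook(grad):
        grads[name] = grad
    return hook

v_a = Variable(torch.Tensor([1., 2., 3.]), requires_grad=True)
v_b = v_a * 2                      # non-leaf: .grad is not populated by default
v_b.register_hook(save_grad('v_b'))

v_b.sum().backward()
print(grads['v_b'])   # [1, 1, 1]: gradient of the loss w.r.t. v_b
```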


Ok, I see. thanks!

Off-topic for this thread really (which is about how to solve a problem, rather than about design), but it seems like it might be nice to have an API that would work like:

somevar.save_grad(True)
... do backprop here...
... somevar.grad now has some values :-)

save_grad(True) could work behind the scenes by secretly, or not so secretly, calling var.register_hook to add an appropriate hook somewhere that saves the grad, perhaps?
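A rough sketch of how that hypothetical save_grad API could be built on register_hook (save_grad_ here is a made-up helper, not part of PyTorch):

```python
import torch
from torch.autograd import Variable

def save_grad_(var):
    """Hypothetical save_grad(True): hook that copies the gradient into var.grad."""
    def hook(grad):
        var.grad = grad
    var.register_hook(hook)
    return var

x = Variable(torch.Tensor([1., 2.]), requires_grad=True)
y = x * 3
save_grad_(y)          # behaves like the proposed y.save_grad(True)

y.sum().backward()
print(y.grad)          # [1, 1], available even though y is not a leaf
```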

Created PR at https://github.com/pytorch/pytorch/pull/2078 for adding a retain_grad() method, to store the gradient into .grad during backprop
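Once that method is available, usage would look like this (a sketch assuming the retain_grad() from the PR):

```python
import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([1., 2., 3.]), requires_grad=True)
y = x * 2
y.retain_grad()        # ask autograd to keep the gradient of this non-leaf

y.sum().backward()
print(y.grad)          # [1, 1, 1], stored in .grad despite y not being a leaf
```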