Problem about tensor and Variable

Hi, I have a problem as follows:

v_a = Variable(Tensor([1, 2, 3]), requires_grad=True)
t_a = v_a.data
t_b = t_a * 2
v_b = Variable(t_b, requires_grad=True)
loss = Crit(v_b, target)
loss.backward()

Please let me know: how can I get the grad of v_a?
I tried to run code like this, but failed to calculate the grad of v_a.
Note: t_b = t_a * 2 is just an example; the real operation is complex and can’t be achieved with Variable operations.
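A minimal sketch of what goes wrong here (with the unknown Crit loss replaced by a simple sum, just for illustration): pulling out .data leaves the autograd graph, so v_b is a brand-new leaf with no connection back to v_a.

```python
import torch
from torch.autograd import Variable

v_a = Variable(torch.Tensor([1., 2., 3.]), requires_grad=True)
t_b = v_a.data * 2                        # .data steps outside the autograd graph
v_b = Variable(t_b, requires_grad=True)   # a brand-new leaf, unrelated to v_a

loss = v_b.sum()
loss.backward()

print(v_b.grad)   # gradient reaches v_b
print(v_a.grad)   # None: the graph was cut at .data
```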

The following code runs ok for me:

import torch
from torch import Tensor
from torch.autograd import Variable
from torch.nn import functional as F

target = Variable(torch.LongTensor([2, 1]))

v_a = Variable(Tensor([[1, 2, 3], [3, 4, 5]]), requires_grad=True)
v_b = v_a * 2
loss = F.cross_entropy(v_b, target)
loss.backward()

print('v_a.grad.data', v_a.grad.data)

A few points:

  • Crit doesn’t exist, as far as I can tell; I replaced it with cross_entropy
  • torch.nn contains functors, but you can find functional equivalents in torch.nn.functional, which I’ve used here
  • you need to provide everything in mini-batches, so:
    – the input to cross_entropy should be two-dimensional, (N, C), where N is the size of the minibatch and C is the number of classes
    – in practice, this meant I had to make your v_a two-dimensional too, so it was a minibatch
  • target, for cross_entropy, is a one-dimensional LongTensor of class labels, one per minibatch example
  • everything should be Variables. Forget that Tensors exist :slight_smile:

I feel like that statement can be misleading for people learning. Variables are wrappers for tensors: think of a Variable as a tensor wrapped with PyTorch’s Variable class so that gradients can be computed automatically :slight_smile:

Yes… the flaw with that is: let’s say you feed a Tensor into a net. You don’t need the gradient for that tensor (no backprop to the input is required), so logically it shouldn’t need to be a Variable. Except it does.

Then you just set requires_grad=False. Using a tensor should be thought of as the same as using Variable(tensor).
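To illustrate the point above, a small sketch (the nn.Linear net is just a stand-in): an input wrapped with requires_grad=False still flows through the network, the parameters get gradients, and no gradient is backpropagated into the input.

```python
import torch
from torch import nn
from torch.autograd import Variable

net = nn.Linear(3, 2)
x = Variable(torch.randn(4, 3), requires_grad=False)  # input: no gradient needed

out = net(x).sum()
out.backward()

print(net.weight.grad is not None)  # True: parameters receive gradients
print(x.grad)                       # None: no backprop into the input
```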

Thanks for your reply, but you ignored this line in my code: “t_b = 2 * (v_a.data)”.
Now I’m sure that this way I can’t get v_a.grad,
so I want to know how to achieve my operation without leaving Variables. It’s similar to scatter_nd in TensorFlow:

indices = tf.constant([[0], [3]])
updates = tf.constant([0.2, 0.6])
scatter = tf.scatter_nd(indices, updates, shape=[4])
print(scatter)
# [0.2, 0, 0, 0.6]
As you can see, each index in indices fills in the corresponding value from updates.
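For reference, one possible way to express the same scatter in PyTorch while staying inside autograd (a sketch, assuming the out-of-place index_add is available in your version): build the result from a zeros Variable rather than writing into .data, so gradients flow back to the updates.

```python
import torch
from torch.autograd import Variable

indices = torch.LongTensor([0, 3])
updates = Variable(torch.Tensor([0.2, 0.6]), requires_grad=True)

# Scatter the updates into a length-4 vector without leaving the graph
scatter = Variable(torch.zeros(4)).index_add(0, indices, updates)
print(scatter)        # [0.2, 0, 0, 0.6]

scatter.sum().backward()
print(updates.grad)   # [1, 1]: gradients flow back through the scatter
```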

Ah, then I agree with you. Unless someone else has an idea. But your title and the code are not helping you here: ‘Problem’ is pretty vague.

Recommend changing title to ‘How to obtain gradient from non-user Variables?’

Just to bump this to the top, note that I have not answered this question. The question is:

"How to obtain the gradient of a Variable that is not user-created?"

I would like to know too :slight_smile:

Hi,

You can’t access it with .grad.
You can use var.register_hook(fn) (doc here), where fn is a function that will be given the gradient of var as input. You can then use this function to monitor the gradient, or store it in a global variable so it’s available somewhere else in your program.
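A minimal sketch of this hook pattern (the grads dict and save_grad helper are just one way to stash the values):

```python
import torch
from torch.autograd import Variable

grads = {}  # store intermediate gradients by name

def save_grad(name):
    def hook(grad):
        grads[name] = grad
    return hook

v_a = Variable(torch.Tensor([1., 2., 3.]), requires_grad=True)
v_b = v_a * 2                      # non-leaf: .grad is not populated by default
v_b.register_hook(save_grad('v_b'))

v_b.sum().backward()
print(grads['v_b'])   # [1, 1, 1]: gradient of the loss w.r.t. v_b
```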


Ok, I see. thanks!

Off-topic for this thread really (which is about how to solve a problem, rather than about design), but it seems like it might be nice to have an API that would work like:

somevar.save_grad(True)
... do backprop here...
... somevar.grad now has some values :-)

save_grad(True) could work behind the scenes by secretly, or not so secretly, calling var.register_hook to add an appropriate hook somewhere that saves the grad, perhaps?
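A rough sketch of how that hypothetical save_grad API could be built on register_hook (save_grad_ here is a made-up helper, not part of PyTorch):

```python
import torch
from torch.autograd import Variable

def save_grad_(var):
    """Hypothetical save_grad(True): hook that copies the gradient into var.grad."""
    def hook(grad):
        var.grad = grad
    var.register_hook(hook)
    return var

x = Variable(torch.Tensor([1., 2.]), requires_grad=True)
y = x * 3
save_grad_(y)          # behaves like the proposed y.save_grad(True)

y.sum().backward()
print(y.grad)          # [1, 1], available even though y is not a leaf
```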

Created PR at https://github.com/pytorch/pytorch/pull/2078 for adding a retain_grad() method, to store the gradient into .grad during backprop
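Once that method is available, usage would look like this (a sketch assuming the retain_grad() from the PR):

```python
import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([1., 2., 3.]), requires_grad=True)
y = x * 2
y.retain_grad()        # ask autograd to keep the gradient of this non-leaf

y.sum().backward()
print(y.grad)          # [1, 1, 1], stored in .grad despite y not being a leaf
```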