# How to define a leaf tensor in PyTorch 0.4.1

Test environment: pytorch 0.4.1

Case 1:
```python
# 1
>>> torch.zeros([1, 2], dtype=torch.float, requires_grad=True).is_leaf
True
# 2
>>> torch.zeros([1, 2], dtype=torch.float, requires_grad=True).cuda().is_leaf
False
# 3
>>> torch.zeros([1, 2], dtype=torch.float).cuda().is_leaf
True
```

Case 2:
```python
y = torch.zeros([1, 2])  # is_leaf = True
y = y.cuda()             # is_leaf = True
y.requires_grad = True   # is_leaf = True
```
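The two cases can be reproduced without a GPU. In the sketch below, `* 2` stands in for `.cuda()` (both count as operations on the tensor as far as autograd is concerned); this is an assumption for illustration, not the exact snippets above.

```python
import torch

# Case 1, snippet 1: created directly with requires_grad=True -> leaf
a = torch.zeros([1, 2], dtype=torch.float, requires_grad=True)
assert a.is_leaf

# Case 1, snippet 2: an op applied to a tensor that already requires
# grad produces a non-leaf (here `* 2` replaces `.cuda()` so the sketch
# runs on CPU)
b = torch.zeros([1, 2], dtype=torch.float, requires_grad=True) * 2
assert not b.is_leaf

# Case 2: requires_grad is set only after all ops -> still a leaf
y = torch.zeros([1, 2]) * 2
y.requires_grad = True
assert y.is_leaf
```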

It seems that snippet 2 of Case 1 does not behave the same as Case 2! Why?

Hi,

It's because Case 2 is the same as snippet 3 of Case 1: you don't perform any operation on `y` while it requires gradients, so it's always a leaf.

But I cannot understand the behavior of snippet 2 of Case 1… In fact, I think the difference is: in snippet 2 of Case 1, I set `requires_grad = True` at creation; in Case 2, I set `requires_grad = True` after creation. Why are these two cases not the same?

The ordering of `requires_grad` and the `cuda()` call matters. After `requires_grad = True`, any operation (such as `x.cuda()` or `x * 2`) is recorded and treated as a differentiable operation.

Here's another example to make this clearer. You could substitute `.cuda()` for `* 2`.

```python
>>> x = torch.tensor([1.0], requires_grad=True)
>>> y = x * 2  # is_leaf = False
>>> print(y)
```
```python
>>> x = torch.tensor([1.0])
>>> y = x * 2
>>> print(y)
tensor([2.])
>>> y.requires_grad = True  # is_leaf = True
>>> print(y)
```

In the first case, `y` is treated as a function of `x` since `x.requires_grad=True`. You can see it has a `grad_fn`, so `is_leaf` is `False`.

In the second case, `y` is treated as independent of `x` because `x` did not have `requires_grad=True`. Because it's independent of `x`, it doesn't have a `grad_fn`, so `is_leaf` is `True`.
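The `grad_fn` / `is_leaf` relationship described above can be checked directly. A minimal sketch, assuming `torch` is installed (no GPU needed):

```python
import torch

# First case: x requires grad, so the multiplication is recorded
x = torch.tensor([1.0], requires_grad=True)
y = x * 2
assert y.grad_fn is not None  # the op was recorded
assert not y.is_leaf

# Second case: x does not require grad, so nothing is recorded before
# requires_grad is turned on; y stays a leaf with no grad_fn
x2 = torch.tensor([1.0])
y2 = x2 * 2
y2.requires_grad = True
assert y2.grad_fn is None
assert y2.is_leaf
```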


Thanks a lot, I get it. I also found a small difference between 0.3.1 and 0.4.1:

0.3.1:

```python
>>> a = Variable(torch.randn(3, 3)).cuda()
>>> a.is_leaf
False
```

0.4.1:

```python
>>> a = Variable(torch.randn(3, 3)).cuda()
>>> a.is_leaf
True
```

This seems more reasonable, because the default value of `requires_grad` for a `Variable` is `False`.
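The same reasoning applies without the (now deprecated) `Variable` wrapper. A minimal CPU-only sketch, with `* 2` standing in for `.cuda()` so it runs without a GPU:

```python
import torch

# requires_grad defaults to False, so an op on such a tensor records
# no history and the result is itself a leaf (the 0.4.1 behavior above)
a = torch.randn(3, 3)
assert a.requires_grad is False

b = a * 2  # stands in for .cuda(); still just an op on a no-grad tensor
assert b.is_leaf
assert b.grad_fn is None
```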