I want to compute the gradient of a function at several points. However, if I use tensors generated with torch.arange, the gradient is not computed (the .grad attributes are None). With ordinary tensors it works. Why?
Here is an example:
import torch
from torch import tensor
def l(w_1, w_2):
    return w_1 * w_2
w_1 = tensor(3., requires_grad=True)
w_2 = tensor(5., requires_grad=True)
l_v = l(w_1, w_2)
l_v.backward()
print(l_v.item(), w_1.grad, w_2.grad) # HERE WORKS OK
#############
for w_1_value in torch.arange(+2, +4, 0.1, requires_grad=True):
    for w_2_value in torch.arange(-2, +4, 0.1, requires_grad=True):
        print(w_1_value, w_2_value)
        l_value = l(w_1_value, w_2_value)
        l_value.backward()
        print(l_value.item(), w_1_value.grad, w_2_value.grad)  # HERE I GET NONE ON GRAD VALUES
Running the loop also prints this warning:
UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations.
Thank you for your answer, but I still don't understand. What is UnbindBackward, and why is it a problem when it is connected to my custom function's graph?
Thanks again for your help
When you iterate over the items of a torch.arange tensor, each item you get is a view of that tensor, not the tensor itself. Views are non-leaf tensors, so autograd does not populate their .grad attribute during backward().
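You can see this directly by inspecting is_leaf and grad_fn. This is a small sketch (the exact grad_fn repr may vary between PyTorch versions):

```python
import torch

# Iterating over a tensor unbinds it along dim 0, so each item is the
# output of an autograd operation, not a leaf tensor.
t = torch.arange(0., 3., requires_grad=True)
print(t.is_leaf)       # True: t was created directly by a factory function
first = next(iter(t))  # equivalent to t.unbind(0)[0]
print(first.is_leaf)   # False: produced by the unbind operation
print(first.grad_fn)   # something like <UnbindBackward0 ...>
```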
Yes, you could call it a view, even though technically it is an unbind operation (hence the UnbindBackward node in the graph). The root cause is the same either way: the tensors yielded by the for loop are non-leaf variables, so backward() does not fill in their .grad.
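One way to make the loop work (a sketch, not the only fix) is to re-create each scalar as a fresh leaf tensor inside the loop with detach().requires_grad_(True); alternatively you could call .retain_grad() on the non-leaf items, as the warning suggests. Note that requires_grad=True on the arange itself is then no longer needed:

```python
import torch

def l(w_1, w_2):
    return w_1 * w_2

for w_1_value in torch.arange(+2., +4., 0.1):
    for w_2_value in torch.arange(-2., +4., 0.1):
        # Turn each unbound item back into a leaf tensor so that
        # backward() populates its .grad attribute.
        w_1_leaf = w_1_value.detach().requires_grad_(True)
        w_2_leaf = w_2_value.detach().requires_grad_(True)
        l_value = l(w_1_leaf, w_2_leaf)
        l_value.backward()
        # For l = w_1 * w_2: d l/d w_1 = w_2 and d l/d w_2 = w_1
        print(l_value.item(), w_1_leaf.grad, w_2_leaf.grad)
```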