Hi, I am seeing strange behavior related to tensor unpacking and leaf variables used in in-place ops. Please see my minimal reproducible example below. I would have thought that make_xy_v1 and make_xy_v2 were equivalent creation patterns, but they give rise to different behavior. Can someone explain why these are different? My torch.__version__ is '1.0.1.post2', installed via conda, and I am running Ubuntu 16.04. Thanks!
import torch

def make_xy_v1(N):
    # Two independently allocated tensors
    x = torch.zeros((N,))
    y = torch.zeros((N,))
    return x, y

def make_xy_v2(N):
    # Unpack the two rows of a single (2, N) tensor
    x, y = torch.zeros((2, N))
    return x, y

def demo(make_xy_fn):
    x, y = make_xy_fn(N=5)
    inds = [0]
    vals = torch.tensor((1.,), requires_grad=True)
    x[inds] = vals
    y[inds] = vals

demo(make_xy_v1)
">>> Executes fine."
demo(make_xy_v2)
">>> RuntimeError: a leaf Variable that requires grad has been used in an in-place operation."
Note: If I cast inds to a tuple, it seems to trigger a different indexing path and no error is thrown.
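For reference, here is a minimal sketch of that tuple workaround (same names as above; this is how it behaved for me on this torch version, so take it as a report rather than a guarantee):

def demo_tuple(make_xy_fn):
    x, y = make_xy_fn(N=5)
    inds = (0,)  # tuple instead of list
    vals = torch.tensor((1.,), requires_grad=True)
    x[inds] = vals
    y[inds] = vals

demo_tuple(make_xy_v2)
">>> No error on this version."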
x and y are actually only views of the original big Tensor, so modifying x or y in-place changes the big Tensor in-place.
The first line modifies x in-place and makes the big Tensor require gradients, since vals requires gradients.
The second line then modifies y in-place, and thus also the big Tensor, which by now is a leaf that requires grad and is being modified in-place. Hence the error.
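You can convince yourself of the view relationship with a quick check (my own illustration, not strictly needed for the argument):

import torch

big = torch.zeros((2, 5))
x, y = big  # unpacking yields one view per row

print(x.storage().data_ptr() == big.storage().data_ptr())  # True: shared storage
print(y.storage().data_ptr() == big.storage().data_ptr())  # True
x[0] = 42.
print(big[0, 0])  # tensor(42.): writing through x changed the big Tensor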
Yes, you are right. I just tried the following and it seems weird to me from a design POV.
y = torch.tensor([0., 1., 2.], requires_grad=True)
y[(1,)] = 0.  # allowed
y[[1,]] = 0.  # throws "RuntimeError: a leaf Variable that requires grad has been used in an in-place operation."
Is there a good reason for this? They both in-place modify a leaf variable that requires grad.
The tuple indexing means that we know exactly which part of the Tensor you’re using: this one entry.
The list indexing can touch an arbitrary part of the Tensor, since the list can contain any number of indices.
In your particular case, I agree that the result is the same, but the autograd engine is over-restrictive: if it doesn’t know for sure that something is valid, it will raise an error instead of risking a silently wrong result.
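If it helps, one possible workaround (an untested sketch on my side, names are illustrative) is to do the in-place writes on the big Tensor itself and take the views afterwards; after the first write the big Tensor has a grad_fn and is no longer a leaf, so the second write is allowed:

def make_and_fill(N, inds, vals):
    big = torch.zeros((2, N))
    big[0, inds] = vals  # big now requires grad and is no longer a leaf
    big[1, inds] = vals  # fine: in-place on a non-leaf is allowed
    x, y = big           # take the views after the writes
    return x, y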
Thanks for your explanation, but the ‘arbitrary part’ doesn’t make sense to me. In my shallow view, what indexing and slicing return is in both cases determined by the index we give, so what does ‘arbitrary’ mean here?
It means that when indexing with a list, you can ask for indices 1, 3, 4 for example, which cannot be represented as a mere view of the original Tensor, and so the original Tensor itself must be modified in-place directly.
If you only ask for index 1, then a new Tensor that shares storage with the original can be returned.
So the two have different behaviours.
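To illustrate the distinction concretely (my own example): integer/tuple indexing returns a view that shares storage with the original, while list indexing allocates a new Tensor:

import torch

t = torch.arange(5.)
view = t[1]          # basic indexing: a view into t's storage
copy = t[[1, 3, 4]]  # list (advanced) indexing: a freshly allocated Tensor

print(view.storage().data_ptr() == t.storage().data_ptr())  # True
print(copy.storage().data_ptr() == t.storage().data_ptr())  # False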