Inconsistency when unpacking tensor as tuple

Hi, I am seeing a strange behavior related to tensor unpacking and leaf variables being used in in-place ops. Please see my minimal reproducible example below. I would have thought that make_xy_v1 and make_xy_v2 are equivalent creation patterns, but they give rise to different behavior. Can someone explain why these are different? My torch.__version__ is ‘1.0.1.post2’, installed via conda. I am running Ubuntu 16.04. Thanks!

import torch

def make_xy_v1(N):
    x = torch.zeros((N,))
    y = torch.zeros((N,))
    return x, y

def make_xy_v2(N):
    x, y = torch.zeros((2, N))
    return x, y

def demo(make_xy_fn):
    x, y = make_xy_fn(N=5)
    inds = [0]
    vals = torch.tensor((1.,), requires_grad=True)

    x[inds] = vals
    y[inds] = vals

">>> Executes fine."

">>> RuntimeError: a leaf Variable that requires grad has been used in an in-place operation."

Note: If I cast inds to a tuple, it seems to trigger a different indexing strategy and no error is thrown.


When you do:

x, y = torch.zeros((2, N))

x and y are actually only views of the original big tensor,
so modifying x or y in place changes the big tensor in place.
The first line (x[inds] = vals) modifies x in place and makes the big tensor require gradients, because vals requires gradients.
The second line then modifies y in place, and thus modifies in place the big tensor that is now needed for gradient computation, which is why the error is raised.
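The view relationship can be checked directly. A minimal sketch (the variable name `big` is mine, introduced for illustration):

```python
import torch

N = 5
big = torch.zeros((2, N))
x, y = big  # unpacking iterates over dim 0, so x and y are views of big

# x starts at the very beginning of big's storage
assert x.data_ptr() == big.data_ptr()

# writing through the view mutates the big tensor as well
x[0] = 1.0
assert big[0, 0].item() == 1.0
```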

Thanks! I understand now. Do you know why replacing inds = [0,] with inds = (0,) in the above example prevents the error from occurring?

I guess one is a slice of a single element and the other is just one index.
The tuple is indexing, and the list is slicing.

Yes, you are right. I just tried the following and it seems weird to me from a design POV.

y = torch.tensor([0., 1., 2.], requires_grad=True)
y[(1,)] = 0.  # allowed
y[[1,]] = 0.  # throws “RuntimeError: a leaf Variable that requires grad has been used in an in-place operation.”

Is there good reason for this? They are both in-place modifying a leaf variable requiring grad.

With tuple indexing, we know exactly which part of the tensor you’re using: that one entry.
List slicing can address an arbitrary part of the tensor, since the list can contain any number of indices.

In your particular case, I agree that the result is the same and the autograd engine is over-restrictive. If it doesn’t know for sure that something is valid, it will raise an error instead of possibly computing something wrong.
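The list-indexing failure can be reproduced as below. (The tuple form behaved differently on torch 1.0.x, as discussed above, but may also be rejected on newer versions, so only the list case is shown here.)

```python
import torch

y = torch.tensor([0., 1., 2.], requires_grad=True)

# List (advanced) indexing assignment is an in-place op on the leaf itself,
# so the autograd engine refuses it:
try:
    y[[1]] = 0.
    raised = False
except RuntimeError as e:
    raised = True
    msg = str(e)

assert raised and "leaf" in msg
```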


Got it, thanks! Sorry for all the questions! :slight_smile:

No problem, happy to help!


Thanks for your explanation, but the ‘arbitrary part’ doesn’t make sense to me. In my shallow understanding, the results of indexing and slicing are both determined by the ‘index’ and ‘slicing index’ we give, so what does ‘arbitrary’ mean? :thinking:

Thanks in advance


It means that when slicing, you can ask for indices 1, 3, 4, for example, which cannot be represented as just a different view of the original tensor. And so the original tensor itself must be modified in place directly.
If you only ask for index 1, then PyTorch can return a new tensor that shares storage with the original.
So the two will have different behaviours.
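This view-versus-copy distinction can be seen even outside of autograd. A small sketch (variable names are mine):

```python
import torch

a = torch.arange(5.)

# Basic indexing with a single index returns a view sharing a's storage:
v = a[1]
a[1] = 42.
# v sees the update, because it is only a different view of the same storage
assert v.item() == 42.0

# Advanced indexing with a list like [1, 3, 4] cannot in general be
# expressed as a strided view, so PyTorch returns a copy:
c = a[[1, 3, 4]]
a[3] = 99.
# c does not see the update, because it owns its own storage
assert c.tolist() == [42.0, 3.0, 4.0]
```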