Inplace Fill with Index Tensor

I want to fill a data tensor with a float given some index tensor. It seems very simple and almost like masked_fill_, but my index tensor doesn’t have the same size as the data tensor. Can you please take a quick look at it? I am looking for something like x.index_fill_(index, 42) in the following example:

import torch

x = torch.Tensor([[0, 1, 2, 3, 4],
                  [ 5, 6, 7, 8, 9],
                  [10, 11, 12, 13, 14]])
index = torch.LongTensor([[1, 3], [0, 4], [2, 3]])
# After indexed fill
x = torch.Tensor([[0, 42, 2, 42, 4],
                  [42, 6, 7, 8, 42],
                  [10, 11, 42, 42, 14]])

I think you want scatter_:

x = torch.tensor([[0, 1, 2, 3, 4],
                  [ 5, 6, 7, 8, 9],
                  [10, 11, 12, 13, 14]], dtype=torch.float)
index = torch.tensor([[1, 3], [0, 4], [2, 3]])
x.scatter_(1, index, 42)
print(x == torch.tensor([[ 0, 42,  2, 42,  4],
                         [42,  6,  7,  8, 42],
                         [10, 11, 42, 42, 14]], dtype=torch.float))
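
To see what scatter_(1, index, 42) does here, this is a rough sketch of its semantics for dim=1 (an illustrative loop, not how it is actually implemented):

for i in range(index.size(0)):
    for j in range(index.size(1)):
        x[i, index[i, j]] = 42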

For more complex tasks, you could also build an index for the first dimension and use indexing:

x = torch.tensor([[0, 1, 2, 3, 4],
                  [ 5, 6, 7, 8, 9],
                  [10, 11, 12, 13, 14]], dtype=torch.float)
index = torch.tensor([[1, 3], [0, 4], [2, 3]])
index0 = torch.arange(3, dtype=torch.long)[:, None].expand(-1, 2)
x[index0.reshape(-1), index.view(-1)] = 42
print(x == torch.tensor([[ 0, 42,  2, 42,  4],
                         [42,  6,  7,  8, 42],
                         [10, 11, 42, 42, 14]], dtype=torch.float))
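
Advanced indexing also lets you write a different value per position instead of a single scalar, for example with a hypothetical values tensor of the same shape as index:

values = torch.tensor([[40., 41.],
                       [42., 43.],
                       [44., 45.]])
x[index0.reshape(-1), index.view(-1)] = values.view(-1)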

Best regards

Thomas


In your example, if x requires grad and I change the values by slicing (for example x[0, :] = torch.zeros((5))), is it still differentiable? In my case x requires grad but the zeros tensor does not.

The answer here is “it depends”, and it is a bit subtle.

  • The good news is yes. We can do
    a = torch.randn(5,5, requires_grad=True)
    b = a+1
    b[0] = torch.zeros(5)
    b.tanh().sum().backward()
    
    and it’ll work!
  • The bad news is there are a number of caveats:
    • You cannot do this on a leaf tensor (e.g. a above) or you get an error that the leaf was pulled into the interior of the graph, so you may need to .clone() the leaf first. That said, if you need this for leaves, doing the zeroing inside a with torch.no_grad(): block and manipulating a.grad after the backward might be a more efficient alternative.
    • This works in the above example because the original b is not needed to compute the backward. If you replace b = a + 1 with b = a.tanh(), you will get the famous “one of the variables needed for gradient computation has been modified by an inplace operation” error message, because a.tanh() actually wants to use its result to compute the derivative (which is 1 - result**2). Again, you could do b = a.tanh().clone() to avoid this (see the sketch after this list).
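
To make the second caveat concrete, here is a minimal sketch of the failing pattern and the .clone() workaround (same shapes as above):

a = torch.randn(5, 5, requires_grad=True)

# Fails: tanh saves its output for the backward pass, and the
# in-place assignment would modify that saved output.
# b = a.tanh()
# b[0] = torch.zeros(5)
# b.tanh().sum().backward()

# Works: the clone is modified in place, the original tanh output stays intact.
b = a.tanh().clone()
b[0] = torch.zeros(5)
b.tanh().sum().backward()
print(a.grad[0])  # all zeros: row 0 of a no longer influences the result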

An alternative (but also with copying) is not to modify the tensors that require grad in place, but to set up a new tensor (without requires_grad) and copy the relevant bits into it.

b = torch.zeros_like(a)
b[1:] = a[1:]

This also works when copying (non-overlapping) bits from several tensors.
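
For example, a small sketch stitching rows from two hypothetical tensors a1 and a2 into one result:

a1 = torch.randn(5, 5, requires_grad=True)
a2 = torch.randn(5, 5, requires_grad=True)
b = torch.zeros(5, 5)   # does not require grad itself
b[:2] = a1[:2]          # rows 0-1 come from a1
b[2:] = a2[2:]          # rows 2-4 come from a2
b.sum().backward()
print(a1.grad)          # ones in rows 0-1, zeros elsewhere
print(a2.grad)          # ones in rows 2-4, zeros elsewhere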

To me, the handling of subtensor assignments is one of the most intriguing - but also subtle - parts of autograd.

Best regards

Thomas


Thanks for the detailed solution… I probably understand autograd a bit better now. As my tensor is not a leaf tensor, it works fine without copying.

PyTorch tensors now have an index_fill_ method.
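
Note that index_fill_ takes a dimension and a 1-D index, so it fills whole slices along that dimension rather than per-row positions as in the original question (where scatter_ still fits). A minimal sketch:

x = torch.zeros(3, 5)
x.index_fill_(1, torch.tensor([1, 3]), 42)  # fills columns 1 and 3 in every row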