Autograd for duplicated index put operation

I’m confused about the index put operation when the given indices are duplicated. Here is a code snippet:

import torch

torch.manual_seed(0)
mapping = torch.randn((3,3))

a = torch.ones((3), requires_grad=True)
index = torch.tensor([0,1,1,1])

# Version 1
b = a @ mapping
b[index] += 1
print("index put with duplicated indices", b)
loss = b.sum()
loss.backward()
print("a.grad", a.grad)

# Version 2
a.grad = None
b = a @ mapping
b[torch.unique(index)] += 1
print("index put with unique indices", b)
loss = b.sum()
loss.backward()
print("a.grad", a.grad)

The outputs are:

index put with duplicated indices tensor([ 3.5128,  0.4601, -4.2966], grad_fn=<IndexPutBackward0>)
a.grad tensor([-1.5181, -4.0837,  2.1982])
index put with unique indices tensor([ 3.5128,  0.4601, -4.2966], grad_fn=<IndexPutBackward0>)
a.grad tensor([-0.9312, -1.9147,  0.5221])

So the outputs of these two index put operations are identical, while the gradients are different. Why does this happen?

From the torch.Tensor.index_put_ docs:

If accumulate is True, the elements in values are added to self. If accumulate is False, the behavior is undefined if indices contain duplicate elements.
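
As a quick illustration of that accumulate=True behavior, here is a minimal sketch on a plain tensor (independent of the snippet above):

x = torch.zeros(3)
idx = torch.tensor([0, 1, 1, 1])
# accumulate=True adds the value once per occurrence of each index
x.index_put_((idx,), torch.tensor(1.), accumulate=True)
print(x)  # tensor([1., 3., 0.]) -- index 1 appears three times, so it receives +3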

Your current code depends on undefined behavior, so use:

a.grad = None
b = a @ mapping
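# accumulate=True adds 1 once per occurrence, so index 1 receives +3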
b.index_put_((index,), torch.tensor(1.), accumulate=True)
print("index put with unique indices", b)
loss = b.sum()
loss.backward()
print("a.grad", a.grad)

instead, so that the contributions at duplicated indices are accumulated explicitly.
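
For reference, here is a minimal sketch (reusing a, mapping, index, and b from the snippet above) to check what this version computes: index 1 appears three times in index, so b[1] receives +3, and since the added values are constants, a.grad reduces to mapping.sum(dim=1):

# per-position counts of the duplicated indices: tensor([1., 3., 0.])
counts = torch.bincount(index, minlength=3).to(mapping.dtype)
assert torch.allclose(b, a.detach() @ mapping + counts)
# the constant offsets do not affect the gradient of b.sum() w.r.t. a
assert torch.allclose(a.grad, mapping.sum(dim=1))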