Potential data race on GPU with tensor advanced indexing

When I assign the values of a tensor c to another tensor a through index mappings b1, b2, …, bi using advanced indexing, a potential data race is observed on the GPU when some indices point to the same entry of a. An example code snippet:

a = torch.zeros(100, 100).cuda()
b = torch.zeros(50, dtype=torch.long).cuda()  # all 50 indices point to row 0
c = torch.arange(50).unsqueeze(1).expand(50, 100).float().cuda()
a[b] = c  # scatters 50 different rows into a[0] concurrently

Since every index in b is zero, all 50 rows of c are written to the single row a[0]. The resulting values are as follows:

In [17]: c
Out[17]:
tensor([[ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
        [ 1.,  1.,  1.,  ...,  1.,  1.,  1.],
        [ 2.,  2.,  2.,  ...,  2.,  2.,  2.],
        ...,
        [47., 47., 47.,  ..., 47., 47., 47.],
        [48., 48., 48.,  ..., 48., 48., 48.],
        [49., 49., 49.,  ..., 49., 49., 49.]], device='cuda:0')

In [15]: a
Out[15]:
tensor([[35., 35., 35.,  ..., 34., 34., 34.],
        [ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
        ...,
        [ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  ...,  0.,  0.,  0.]], device='cuda:0')

In [16]: a[0]
Out[16]:
tensor([35., 35., 35., 35., 35., 35., 35., 35., 35., 35., 35., 35., 35., 35.,
        35., 35., 35., 35., 35., 35., 14., 14., 14., 14., 14., 14., 14., 14.,
        14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14.,
        14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14.,
        34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34.,
        34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34.,
        34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34.,
        34., 34.], device='cuda:0')

I wonder whether there should be documentation or a warning for this case. The aggregation rule of this implicit tensor scatter over multiple dimensions appears to be undefined when indices repeat.
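
In the meantime, one can at least detect the hazard before relying on a[b] = c. A minimal sketch, assuming a single index tensor b as in the snippet above (the has_duplicates name is just for illustration):

import torch

b = torch.zeros(50, dtype=torch.long).cuda()

# a[b] = c is only well-defined when b contains no repeated entries;
# compare the number of unique indices against the total count.
has_duplicates = b.unique().numel() < b.numel()
print(has_duplicates)  # True here: every index points to row 0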

This is expected behavior, and the docs for index_put_ mention:

If accumulate is True, the elements in value are added to self. If accumulate is False, the behavior is undefined if indices contain duplicate elements.
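
For reference, a minimal sketch of the documented escape hatch: passing accumulate=True makes the reduction over duplicate indices a sum instead of leaving it undefined. The index_put_ call below is otherwise equivalent to a[b] = c from the report:

import torch

a = torch.zeros(100, 100).cuda()
b = torch.zeros(50, dtype=torch.long).cuda()
c = torch.arange(50).unsqueeze(1).expand(50, 100).float().cuda()

# Contributions from duplicate indices in b are summed atomically
# rather than racing with one another.
a.index_put_((b,), c, accumulate=True)
print(a[0, 0])  # 0 + 1 + ... + 49 = 1225

Note that on CUDA the atomic additions can still occur in any order, so floating-point results are only reproducible up to rounding; here the summands are exact integers, so the output is deterministic.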