Potential data race on gpu with tensor advanced indexing

When I try to assign the values of a tensor `c` to another tensor `a` via index tensors `b1, b2, …, bi` using advanced indexing, a potential data race is observed on the GPU when some of the indices point to the same entry of `a`. An example code snippet is posted here:

```python
import torch

a = torch.zeros(100, 100).cuda()
b = torch.zeros(50, dtype=torch.long).cuda()  # all 50 indices point to row 0
c = torch.arange(50).unsqueeze(1).expand(50, 100).float().cuda()
a[b] = c
```

and the corresponding values are as follows:

```
In [17]: c
Out[17]:
tensor([[ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
        [ 1.,  1.,  1.,  ...,  1.,  1.,  1.],
        [ 2.,  2.,  2.,  ...,  2.,  2.,  2.],
        ...,
        [47., 47., 47.,  ..., 47., 47., 47.],
        [48., 48., 48.,  ..., 48., 48., 48.],
        [49., 49., 49.,  ..., 49., 49., 49.]], device='cuda:0')

In [15]: a
Out[15]:
tensor([[35., 35., 35.,  ..., 34., 34., 34.],
        [ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
        ...,
        [ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  ...,  0.,  0.,  0.]], device='cuda:0')

In [16]: a[0]
Out[16]:
tensor([35., 35., 35., 35., 35., 35., 35., 35., 35., 35., 35., 35., 35., 35.,
        35., 35., 35., 35., 35., 35., 14., 14., 14., 14., 14., 14., 14., 14.,
        14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14.,
        14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14.,
        34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34.,
        34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34.,
        34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34., 34.,
        34., 34.], device='cuda:0')
```

I wonder whether there should be some documentation or a warning for this case. The aggregation rule of this implicit tensor scattering over duplicate indices seems to be undefined.

This is expected behavior, and the docs for `index_put_` mention:

> If `accumulate` is `True`, the elements in `value` are added to `self`. If `accumulate` is `False`, the behavior is undefined if indices contain duplicate elements.
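In other words, if you need a well-defined result with duplicate indices, a sketch of the deterministic alternative is to call `index_put_` with `accumulate=True`, which sums colliding writes instead of racing (small CPU example, shapes chosen for illustration only):

```python
import torch

a = torch.zeros(5, 3)
b = torch.zeros(4, dtype=torch.long)  # all four indices collide on row 0
c = torch.ones(4, 3)

# With accumulate=True, duplicate indices are reduced by summation,
# so the result is deterministic: a[0] receives the sum of all four rows of c.
a.index_put_((b,), c, accumulate=True)
print(a[0])  # tensor([4., 4., 4.])
```

With `accumulate=False` (the default, equivalent to `a[b] = c`), whichever write "wins" at each element is unspecified, which is exactly the mixed 35/14/34 pattern seen in the output above.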