Non-deterministic behaviour when indexing a tensor with meshgrid

Hi,

I am trying to avoid a nested for loop by creating a meshgrid of the two loop variables. However, doing the same assignment twice via the grids gives me two different results.

Here is the code with the assignment done with a nested for loop:

var1 = torch.zeros_like(vertex[bi]) # shape (126,224,9,2), dtype=torch.float32
var2 = torch.zeros_like(vertex[bi]) # shape (126,224,9,2), dtype=torch.float32

# dl_dv0 and dl_dv1 have shape (rn,vn,2) = (256, 9, 2)
# hw_index_0 and hw_index_1 have shape (rn,vn,2) and contain indices
# into the first two dimensions of var1/var2

for r in range(rn):
    for k in range(vn):
        var1[hw_index_0[r,k,0],hw_index_0[r,k,1],k,:] += dl_dv0[r,k,:]
        var1[hw_index_1[r,k,0],hw_index_1[r,k,1],k,:] += dl_dv1[r,k,:]

        var2[hw_index_0[r,k,0],hw_index_0[r,k,1],k,:] += dl_dv0[r,k,:]
        var2[hw_index_1[r,k,0],hw_index_1[r,k,1],k,:] += dl_dv1[r,k,:]

diff0 = (var1 - var2).abs()
print(diff0.mean(), diff0.max())

>>>  tensor(0., device='cuda:0') tensor(0., device='cuda:0')

The difference is 0, as expected, since both variables receive exactly the same assignments.

If I try to do the same with meshgrid I get:

all_ks = torch.arange(vn)
all_rs = torch.arange(rn)
grid_r, grid_k = torch.meshgrid(all_rs, all_ks)

var1 = torch.zeros_like(vertex[bi]) # shape (126,224,9,2), dtype=torch.float32
var2 = torch.zeros_like(vertex[bi]) # shape (126,224,9,2), dtype=torch.float32

# dl_dv0 and dl_dv1 have shape (rn,vn,2) = (256, 9, 2)
# hw_index_0 and hw_index_1 have shape (rn,vn,2) and contain indices
# into the first two dimensions of var1/var2

var1[hw_index_0[grid_r,grid_k,0],hw_index_0[grid_r,grid_k,1],grid_k,:] += dl_dv0[grid_r,grid_k,:]
var1[hw_index_1[grid_r,grid_k,0],hw_index_1[grid_r,grid_k,1],grid_k,:] += dl_dv1[grid_r,grid_k,:]

var2[hw_index_0[grid_r,grid_k,0],hw_index_0[grid_r,grid_k,1],grid_k,:] += dl_dv0[grid_r,grid_k,:]
var2[hw_index_1[grid_r,grid_k,0],hw_index_1[grid_r,grid_k,1],grid_k,:] += dl_dv1[grid_r,grid_k,:]

diff0 = (var1 - var2).abs()
print(diff0.mean(), diff0.max())

>>> tensor(2.3726e-06, device='cuda:0') tensor(0.6372, device='cuda:0')

As you can see, the same assignments produce two different results. Am I misunderstanding how to use meshgrid? Is it possible to use it to achieve what I'm trying to do? I searched on Google for something similar without success.
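For what it's worth, here is a minimal check of what I suspect is happening: with advanced indexing, `+=` is a scatter, so duplicate index entries overwrite each other instead of accumulating (the tensors here are toy stand-ins, not my real data):

```python
import torch

# With advanced indexing, `+=` reads once and scatters the writes, so
# repeated indices collide instead of accumulating (and on CUDA the
# write order can be non-deterministic).
x = torch.zeros(3)
idx = torch.tensor([0, 0, 0])
x[idx] += 1.0
print(x)  # tensor([1., 0., 0.]) -- only one increment survives, not 3

# index_put_ with accumulate=True sums the duplicates instead.
y = torch.zeros(3)
y.index_put_((idx,), torch.ones(3), accumulate=True)
print(y)  # tensor([3., 0., 0.])
```

If that is indeed the problem, would `index_put_` with `accumulate=True` be the right way to replace the loop?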

I am using torch 1.7.1 and can’t update due to external constraints.
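In case it is relevant, this is roughly how I checked whether my write targets collide (with random stand-ins for my real `hw_index_0` data, so the shapes match but not the values):

```python
import torch

# Stand-in data with the same shapes as in my code above.
rn, vn = 256, 9
hw = torch.randint(0, 126, (rn, vn, 2))  # stand-in for hw_index_0
k = torch.arange(vn).expand(rn, vn)      # k index for every (r, k) pair

# Flatten every write target to an (h, w, k) triple and count duplicates;
# any count > 1 means two scattered writes hit the same element.
triples = torch.stack((hw[..., 0], hw[..., 1], k), dim=-1).reshape(-1, 3)
_, counts = torch.unique(triples, dim=0, return_counts=True)
print("colliding targets:", int((counts > 1).sum()))
```

With my real indices this prints a non-zero number, which I assume explains the mismatch.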