Wrong value when accessing tensor with pointer

I’m working on a CUDA extension for PyTorch. When I receive grad_output, printing it yields the correct values.

std::cout << upstreamGrad << std::endl;

output 
1 1 1 1 1 1
[ CUDAFloatType{1,6} ]

However, accessing it with a pointer returns wrong values.

std::vector<float> tmp(6);
cudaMemcpy(tmp.data(), upstreamGrad.data_ptr(), 6 * sizeof(float), cudaMemcpyDeviceToHost);
for (int i = 0; i < 6; i++) {
  std::cout << tmp[i] << std::endl;
}

output
1
20
12
0
27
22

I was wondering if I’m missing something here.

You are not taking the storage_offset and stride information into account.
Given that the first value is correct but the others are not, I would guess that this is a non-contiguous Tensor, and you should use its strides to read it properly.
Doing upstreamGrad.contiguous().data_ptr() should give you the right result.
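
To make the stride point concrete, here is a minimal sketch in plain C++ (no libtorch, so it only simulates the memory layout). gather_2d is a hypothetical helper, not a PyTorch function. It reads element (i, j) at base[i * stride0 + j * stride1], which is what a strided read has to do. In the extension itself the sizes and strides would come from upstreamGrad.sizes() and upstreamGrad.strides().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of libtorch): copies a 2-D strided view
// into a densely packed, row-major vector.
// Strides are given in elements, not bytes, as PyTorch reports them.
std::vector<float> gather_2d(const float* base,
                             std::int64_t rows, std::int64_t cols,
                             std::int64_t stride0, std::int64_t stride1) {
  assert(base != nullptr);
  std::vector<float> out;
  out.reserve(static_cast<std::size_t>(rows * cols));
  for (std::int64_t i = 0; i < rows; ++i) {
    for (std::int64_t j = 0; j < cols; ++j) {
      // Element (i, j) lives at offset i * stride0 + j * stride1.
      out.push_back(base[i * stride0 + j * stride1]);
    }
  }
  return out;
}
```

As an assumed scenario (the post does not confirm it): if the gradient were an expanded view with sizes {1, 6} and strides {0, 0}, only the first float behind the pointer would belong to the tensor. Calling gather_2d(buf, 1, 6, 0, 0) on a buffer that starts with 1 returns six ones, while copying six consecutive floats picks up whatever happens to sit next to that value in memory. One more thing to watch with the contiguous() fix: contiguous() may return a new Tensor, so keep it in a named variable for as long as you use the pointer obtained from it.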
