[ATen] Change tensor ordering to column-major to use standard CUDA functions

I’m using ATen and cpp-extensions with PyTorch to work with sparse matrices on the GPU.
But I have run into a problem: the standard CUDA library functions assume matrices are stored in Fortran-style (column-major) order, while PyTorch (ATen) tensors are stored in C-style (row-major) order.
Could you tell me how to change a PyTorch tensor’s ordering?

How exactly does changing a tensor to column-major help sparse matrices on GPU?

I think you could transpose the matrix: matrix.t().contiguous(). That changes the matrix to column-major, but I’m not sure that’s what you want.

Sorry for resurrecting this old question. But I have a similar one.

Suppose I have a tensor x with shape B, C, H, W. Which is better?

x.permute({2, 3, 0, 1}).clone()

x.t().contiguous()
x.t() won’t work on a tensor with 4 dimensions, since it expects a tensor with at most 2 dimensions, so you should remove the {} and use the first approach:

x.permute(2, 3, 0, 1).contiguous()

Thank you for pointing that out.

I tried the code without the {} and it threw a compilation error:

error: too many arguments in function call

That is because I am writing in C++, where the ATen API expects an IntArrayRef:

at::Tensor at::permute(const at::Tensor &self, at::IntArrayRef dims)

Now I have this additional question:

  • Is there any advantage to using clone() over contiguous()? Would they be equivalent performance-wise in this case?

Many thanks!

Ah OK, I didn’t realize this.

.clone() will just create a copy of the tensor (keeping its strides) and will not make it contiguous, while .contiguous() will create a contiguous (in-memory) copy of the tensor.
