Using THCudaBlas_Sgemm with torch::Tensor?

I’m updating an old CUDA extension. The old code used THCudaTensor and THCudaBlas_Sgemm. The updated code uses torch::Tensor, but I’m not sure how to update the THCudaBlas_Sgemm call accordingly.

Original call

THCudaBlas_Sgemm(state, 'n', 'n', n, m, k, 1.0f,
                     THCudaTensor_data(state, columns), n,
                     THCudaTensor_data(state, weight), k, 1.0f,
                     THCudaTensor_data(state, output_n), n);

I’ve tried

THCudaBlas_Sgemm(state, 'n', 'n', n, m, k, 1.0f,
                     columns.data(), n,
                     weight.data(), k, 1.0f,
                     output_n.data(), n);

But I get the error

error: cannot convert ‘at::Tensor’ to ‘float*’ for argument ‘8’ to ‘void THCudaBlas_Sgemm(THCState*, char, char, int64_t, int64_t, int64_t, float, float*, int64_t, float*, int64_t, float, float*, int64_t)’

I’ve also considered converting this call to torch::addmm, but the leading-dimension arguments in sgemm make that a complicated conversion, since addmm has no equivalent argument.
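
(Side note: the compile error itself seems to be just the pointer type. .data() returns an at::Tensor, whereas the old THCudaTensor_data returned a raw float*. Something like the following, using .data_ptr<float>() and assuming the tensors are contiguous float32 CUDA tensors, should at least satisfy the signature:

THCudaBlas_Sgemm(state, 'n', 'n', n, m, k, 1.0f,
                     columns.data_ptr<float>(), n,
                     weight.data_ptr<float>(), k, 1.0f,
                     output_n.data_ptr<float>(), n);

That still leaves the question of whether there is a cleaner torch:: equivalent.)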


THCTensor_ might work, but wouldn’t it be easier to use torch.matmul directly?

Does torch.matmul have arguments similar to LDA (leading dimension A) and LDB (leading dimension B)?

I’ve been working out some slicing solutions to get the same result, but this old code is poorly documented and I’d like to stay as close as possible to the original to avoid introducing bugs where I don’t understand the original intent.


No, you would have to pass the transposed matrix if you would like to change the leading dimensions.
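
For this particular call the leading dimensions already match the matrix sizes, so if the usual im2col layout applies here (columns being (k, n), weight flattening to (m, k), and output_n being an (m, n) slice; I’m guessing at the shapes, so treat this as a sketch), the column-major C(n x m) = columns(n x k) * weight(k x m) + C is the row-major output_n = weight @ columns + output_n, i.e. a single addmm_:

// Sketch only; shapes are assumptions based on a typical im2col convolution.
// alpha = beta = 1.0f in the original call, which matches addmm_'s defaults.
output_n.view({m, n}).addmm_(weight.view({m, k}), columns.view({k, n}));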

@cmauceri @ptrblck do you have a link to any documentation for THCudaBlas_Sgemm? I cannot find it. I notice this is code from the depth-aware segmentation for RGB-D images paper. I too am trying to port it to PyTorch 1.7 and write my own C++ extension, and it is difficult to find documentation for the older functions used.

This method dispatches to at::cuda::blas::gemm and calls cuBLAS routines such as cublasGemmEx (docs). I think it might be easier and cleaner to use torch:: methods instead of calling into these backends directly, but I’m also unfamiliar with the implementation.
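
If staying close to the original is the priority, the most literal replacement might be calling that backend directly; a rough sketch (the exact signature of at::cuda::blas::gemm has changed between releases, so this is an assumption based on the ~1.7 headers, not a guaranteed recipe):

#include <ATen/cuda/CUDABlas.h>

// Same arguments as THCudaBlas_Sgemm, minus the THCState* and with typed pointers.
at::cuda::blas::gemm<float>(
    'n', 'n', n, m, k, 1.0f,
    columns.data_ptr<float>(), n,
    weight.data_ptr<float>(), k, 1.0f,
    output_n.data_ptr<float>(), n);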


I’m also working on that part now. I use torch:: methods directly, but I’m not sure whether that will affect the final results. May I ask whether you got exactly the same results as the older version?

@JamesDickens @Looottch I have a completed port of DepthAwareCNN at github.com/crmauceri/DepthAwareCNN-pytorch1.5 (Depth-aware CNN for RGB-D Segmentation, ECCV 2018).

Thanks. I went through depthconv_cuda.cpp and could not find a function like “depthconv_backward_parameters_cuda”. Did you not convert that function, or am I missing it somewhere?

I tried to do a one-to-one conversion, but in the end I did a re-implementation based on the paper instead, so some of the functions may be different.