Hello, I’m implementing perforated convolution for use during training (essentially, strided convolution followed by a very specific upscaling to restore the output size). In Python I can (poorly) simulate the algorithm with strided and transposed conv2d, but I want both performance and exact control over what the algorithm is doing, which matters even more during the backward pass.
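For concreteness, a minimal sketch of that simulation in ATen C++ terms. This assumes a uniform perforation pattern and uses nearest-neighbour upsampling standing in for the transposed conv2d; the function name is mine:

```cpp
#include <torch/torch.h>

// Rough sketch of the simulation using public ATen ops: a strided conv2d
// computes only every `stride`-th output position, then nearest-neighbour
// upsampling restores the dense output size. The uniform perforation
// pattern and the nearest-neighbour fill are simplifications of the
// "very specific upscaling" described above.
torch::Tensor perforated_conv2d_sim(const torch::Tensor& input,
                                    const torch::Tensor& weight,
                                    const torch::Tensor& bias,
                                    int64_t stride,
                                    int64_t padding) {
  // Strided convolution: the "perforated" (sparse) output.
  auto sparse = torch::conv2d(input, weight, bias, /*stride=*/stride,
                              /*padding=*/padding);

  // Dense (stride-1) output size that should be restored.
  const auto kH = weight.size(2), kW = weight.size(3);
  const auto outH = input.size(2) + 2 * padding - kH + 1;
  const auto outW = input.size(3) + 2 * padding - kW + 1;

  // Replicate each computed value into the skipped positions.
  return at::upsample_nearest2d(sparse, {outH, outW});
}
```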
Right now I want to focus on a CPU-only implementation, since control over what is happening matters more to me than raw compute performance.
My issue is that I have never used the C++ extensions and don’t exactly know how to go about this. The tutorial extension for LLTM works, but copying ConvolutionMM2d.cpp and binding its forward/backward functions does not.
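The binding pattern I’m trying to follow is essentially the one from the LLTM tutorial, roughly like this (a minimal sketch; the function name and its placeholder body are mine, not the real implementation):

```cpp
#include <torch/extension.h>

// Minimal extension skeleton in the style of the LLTM tutorial. The body is
// a placeholder (a plain strided conv2d via the public ATen API); the real
// perforated forward/backward would go here.
torch::Tensor perforated_conv2d_forward(const torch::Tensor& input,
                                        const torch::Tensor& weight,
                                        const torch::Tensor& bias,
                                        int64_t stride) {
  return torch::conv2d(input, weight, bias, /*stride=*/stride);
}

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("forward", &perforated_conv2d_forward, "perforated conv2d forward");
}
```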
JIT compilation fails with errors such as:
```
In function ‘at::Tensor& at::native::thnn_conv2d_out(const at::Tensor&, const at::Tensor&, at::IntArrayRef, const std::optional<at::Tensor>&, at::IntArrayRef, at::IntArrayRef, at::Tensor&)’:
/mnt/c/Users/timotej/pytorch/test.cpp:611:26: error: ‘thnn_conv2d_forward_out’ is not a member of ‘at’; did you mean ‘slow_conv3d_forward_out’?
  611 | return std::get<0>(at::thnn_conv2d_forward_out(output, finput, fgrad_input, self, weight, kernel_size, bias, stride, padding));
      |                        ^~~~~~~~~~~~~~~~~~~~~~~
      |                        slow_conv3d_forward_out
```

or

```
error: cannot convert ‘at::Tensor’ to ‘c10::ScalarType’ in argument passing
```
So, I suppose I am asking the following:
- Am I doing something fundamentally wrong in making this an extension?
- Any general advice for writing extensions?
- Is the above MM2d file (and then unfolding the tensor) the best way of building a “custom” convolution operation? (A rough sketch of what I mean is below.)
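For reference, the unfold-based approach I have in mind looks roughly like this. It is a sketch against the public ATen API rather than the ConvolutionMM2d internals, and all names are mine:

```cpp
#include <torch/torch.h>

// Sketch of a convolution built from im2col (unfold) + matmul, which is the
// decomposition ConvolutionMM2d.cpp uses internally. Perforation would then
// amount to controlling which columns of `cols` are actually computed.
torch::Tensor conv2d_via_unfold(const torch::Tensor& input,   // [N, C, H, W]
                                const torch::Tensor& weight,  // [O, C, kH, kW]
                                int64_t stride,
                                int64_t padding) {
  const auto N = input.size(0);
  const auto O = weight.size(0);
  const auto kH = weight.size(2), kW = weight.size(3);
  const auto outH = (input.size(2) + 2 * padding - kH) / stride + 1;
  const auto outW = (input.size(3) + 2 * padding - kW) / stride + 1;

  namespace F = torch::nn::functional;
  // im2col: one column of C*kH*kW values per output position
  // -> [N, C*kH*kW, outH*outW]
  auto cols = F::unfold(input, F::UnfoldFuncOptions({kH, kW})
                                   .stride(stride)
                                   .padding(padding));

  // [O, C*kH*kW] x [N, C*kH*kW, outH*outW] -> [N, O, outH*outW]
  auto out = weight.view({O, -1}).matmul(cols);
  return out.reshape({N, O, outH, outW});
}
```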