Generating CUDA code for customized layers from a torch.fx graph

Is there any way to add a customized CUDA kernel generator that works from the graph produced by torch.fx?
For instance, I define a simple CNN model with two layers, and my goal is to generate CUDA code for a specialized convolution kernel (let’s say conv_new) that replaces the original convolution, using the FX graph IR as the starting point.

Should I first build a customized operator as a PyTorch extension and then replace the original kernel in the model with the new one? Or is there any codegen for torch.fx that generates the kernel directly from templates?

I don’t know if PyTorch specifies a standard interface for marking certain torch.fx nodes to run as custom CUDA kernels, but you could try binding a Python function to the kernel, inserting a new node into the graph, and setting that function as the node’s target.
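A rough sketch of that idea, in case it helps. Here `conv_new`, `SmallCNN`, and `replace_convs` are made-up names for illustration, and `conv_new` simply forwards to `F.conv2d` as a stand-in; in a real setup its body would call your compiled kernel (for example one built as a C++/CUDA extension). It also assumes the convolutions use default dilation and groups.

```python
import torch
import torch.fx as fx
import torch.nn as nn
import torch.nn.functional as F


def conv_new(x, weight, bias, stride, padding):
    # Hypothetical stand-in: a real version would call the custom CUDA kernel
    # here. F.conv2d is used only so the sketch runs without a compiled extension.
    return F.conv2d(x, weight, bias, stride=stride, padding=padding)


class SmallCNN(nn.Module):
    # Toy two-layer model matching the question (name is an assumption).
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, 3, padding=1)
        self.conv2 = nn.Conv2d(8, 4, 3, padding=1)

    def forward(self, x):
        return self.conv2(torch.relu(self.conv1(x)))


def replace_convs(model):
    # Trace the model, then swap every nn.Conv2d call_module node for a
    # call_function node whose target is conv_new.
    gm = fx.symbolic_trace(model)
    modules = dict(gm.named_modules())
    for node in list(gm.graph.nodes):
        if node.op != "call_module":
            continue
        conv = modules[node.target]
        if not isinstance(conv, nn.Conv2d):
            continue
        with gm.graph.inserting_before(node):
            # Fetch the parameters of the original layer as graph nodes.
            w = gm.graph.get_attr(node.target + ".weight")
            b = None
            if conv.bias is not None:
                b = gm.graph.get_attr(node.target + ".bias")
            new_node = gm.graph.call_function(
                conv_new,
                args=(node.args[0], w, b, conv.stride, conv.padding),
            )
        node.replace_all_uses_with(new_node)
        gm.graph.erase_node(node)
    gm.graph.lint()
    gm.recompile()
    return gm


model = SmallCNN().eval()
gm = replace_convs(model)
x = torch.randn(2, 3, 8, 8)
out = gm(x)  # forward pass now goes through conv_new
```

Printing `gm.graph` or `gm.code` after the rewrite shows whether the convolution nodes were actually replaced. This only reroutes the call; it does not generate any CUDA code by itself, so the kernel would still have to be written or generated separately.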
