Hi, I have custom CPU and CUDA kernels, but I want to run them on my MacBook Pro GPU. Are there any tutorials on writing custom kernels for the MPS backend, and in particular on how to load a custom op in PyTorch?
I don’t think we have a tutorial for that yet, no.
You should be able to write custom Metal kernels within PyTorch, though.
I’m not sure whether the cpp_extension module works for these, though. Feel free to try it out and open an issue on GitHub if you run into any problems!
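If you want to experiment, here is a minimal sketch of how one might try JIT-loading such an op with `torch.utils.cpp_extension.load`. The extension name `custom_mps_op` and the source file `custom_op.mm` are hypothetical placeholders, and the helper names are made up for illustration. Whether Objective-C++/Metal sources actually build this way on your PyTorch version is exactly the untested part, so treat this as a starting point rather than a confirmed recipe.

```python
import importlib.util
from pathlib import Path


def mps_available() -> bool:
    """Return True only if PyTorch is installed and reports a usable MPS device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    # Older PyTorch releases have no torch.backends.mps module at all.
    mps = getattr(torch.backends, "mps", None)
    return bool(mps is not None and mps.is_available())


def try_load_extension(name, sources):
    """JIT-compile and load an extension, or return None if prerequisites are missing.

    `name` and `sources` are placeholders supplied by the caller; nothing here
    is specific to a real kernel.
    """
    # Check the source files first so we never start a build that cannot succeed.
    missing = [s for s in sources if not Path(s).is_file()]
    if missing or not mps_available():
        return None

    from torch.utils.cpp_extension import load

    # Assumption: the toolchain accepts these sources (e.g. a .mm file that
    # wraps a Metal kernel). This is the part the thread says is unverified.
    return load(name=name, sources=list(sources), verbose=True)


# Hypothetical usage: "custom_op.mm" is an assumed file name, not a real one.
ext = try_load_extension("custom_mps_op", ["custom_op.mm"])
print("extension loaded" if ext is not None else "prerequisites missing; nothing compiled")
```

If the build fails, the `verbose=True` output is the useful thing to attach to a GitHub issue.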