Is it possible to write (simple) custom kernels for Intel GPUs similar to the approach for NVIDIA GPUs using CUDA extensions?
I am used to programming in C++ and started looking into SYCL and some of the XPU functionality in PyTorch. However, I am missing a minimal example showing how basic functions like Tensor.pow() are implemented for the XPU backend.
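To make the question concrete, here is a minimal standalone sketch of what the device-side part of such a kernel could look like in plain SYCL 2020, outside of PyTorch: an element-wise pow over a buffer, submitted to a GPU queue. It needs a SYCL compiler such as oneAPI DPC++ (icpx) to build; the names here are illustrative, not taken from PyTorch's sources.

```cpp
#include <sycl/sycl.hpp>
#include <vector>

int main() {
  // Pick a GPU if available; gpu_selector_v throws if none is present.
  sycl::queue q{sycl::gpu_selector_v};

  std::vector<float> in{1.f, 2.f, 3.f, 4.f};
  std::vector<float> out(in.size());

  {
    // Buffers hand ownership to the runtime for the scope's duration.
    sycl::buffer<float> buf_in(in.data(), sycl::range<1>(in.size()));
    sycl::buffer<float> buf_out(out.data(), sycl::range<1>(out.size()));

    q.submit([&](sycl::handler& h) {
      sycl::accessor a{buf_in, h, sycl::read_only};
      sycl::accessor b{buf_out, h, sycl::write_only, sycl::no_init};
      // One work-item per element: b[i] = a[i]^2.
      h.parallel_for(sycl::range<1>(in.size()), [=](sycl::id<1> i) {
        b[i] = sycl::pow(a[i], 2.0f);
      });
    });
  }  // Buffer destructors synchronize and copy results back to out.
}
```

The pattern (queue, accessors, parallel_for over the element count) is the SYCL analogue of a CUDA `<<<grid, block>>>` element-wise launch.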
Hi @EikanWang, I am very happy to hear that this is WIP. Is there perhaps some part of the LibTorch source code (C++ API) that I could already look at to see how the existing kernels are implemented? As mentioned (maybe in my other post), I am not using the Python API but the C++ API directly, which might make it easier to integrate a few extra kernels.
As of now, we have not distributed libtorch for XPU. I think the Intel Extension for PyTorch may be another example that demonstrates how to extend ATen operations through SYCL. However, its logic is somewhat heavy to work through. Is libtorch a must-have for your case?
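For extending ATen from C++, the general mechanism (independent of the Intel Extension for PyTorch's internals) is to register a custom operator and bind an implementation to the XPU dispatch key via `TORCH_LIBRARY` / `TORCH_LIBRARY_IMPL`. A hedged sketch, assuming a hypothetical namespace `myops` and op `my_pow`; the SYCL launch itself is elided, and this only builds against a libtorch that ships the XPU dispatch key:

```cpp
#include <torch/library.h>
#include <torch/torch.h>

// Hypothetical XPU implementation; a real version would launch a SYCL
// kernel over self's data instead of this placeholder computation.
at::Tensor my_pow_xpu(const at::Tensor& self, double exponent) {
  // ... enqueue SYCL kernel here ...
  return self.pow(exponent);  // placeholder fallback
}

// Declare the operator schema once.
TORCH_LIBRARY(myops, m) {
  m.def("my_pow(Tensor self, float exponent) -> Tensor");
}

// Bind the implementation to the XPU backend's dispatch key.
TORCH_LIBRARY_IMPL(myops, XPU, m) {
  m.impl("my_pow", my_pow_xpu);
}
```

With this in place, the op is reachable from C++ through the dispatcher (and from Python as `torch.ops.myops.my_pow`), and the dispatcher routes XPU tensors to `my_pow_xpu`.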
Yes, libtorch is crucial, as my entire code is written in C++. When installing PyTorch as described in Getting Started on Intel GPU — PyTorch 2.5 documentation, I can find the libtorch libraries somewhere in the Python installation. Except for my few custom kernels, my code runs fine.