OpenCL backend for pytorch - now easily installable

Hello Dear Pytorch users,

I released a new version of the pytorch out-of-tree OpenCL backend. It allows you to train your models on AMD, NVidia, and even Intel GPUs, on both Windows and Linux - without platform-specific mess.

What is new is that I now provide prebuilt whl files for easy installation for pytorch 2.4. See: GitHub - artyom-beilis/pytorch_dlprim: DLPrimitives/OpenCL out of tree backend for pytorch
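As a minimal sketch of what using the backend looks like (assumptions: the wheel is imported as `pytorch_ocl` and registers an `ocl` device type, as the project README suggests; the example falls back to CPU when the wheel is not installed):

```python
import torch

# Assumption: importing pytorch_ocl (from the prebuilt wheel)
# registers an "ocl" device type with pytorch.
try:
    import pytorch_ocl  # import name assumed from the project README
    dev = torch.device("ocl:0")  # first OpenCL device
except ImportError:
    dev = torch.device("cpu")  # fall back when the wheel is absent

# Tensors and models move to the device the usual way.
x = torch.randn(4, 8, device=dev)
w = torch.randn(8, 2, device=dev)
y = x @ w
print(tuple(y.shape))  # (4, 2)
```

The point is that no model code changes are needed beyond selecting the device, the same way you would switch between `cuda` and `cpu`.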

And how is the performance? While it isn’t as good as native cuda or rocm, depending on the network it is still a competitive result, especially compared to ROCM: pytorch_dlprim/benchmark.md at master · artyom-beilis/pytorch_dlprim · GitHub

I encourage everybody to test it.

It still has lots of gaps in both functionality and performance - but I think having a true open-source, cross-platform training backend is very important for the future of deep learning.


I released version 0.2.0 of the OpenCL backend, including binary whl files for pytorch 2.4.

In a nutshell:

  • The new Intel Arc GPU is now tested, and performance improvements were added.
  • Vision transformers are now validated and working.
  • Many operators were implemented and many were fixed.

Benchmarks are updated to the latest version.

A short summary comparing OpenCL vs native performance (rocm/cuda/xpu):

| Vendor | Device   | % Perf Train | % Perf Test | Compared to |
| ------ | -------- | ------------ | ----------- | ----------- |
| AMD    | rx6600XT | 71%          | 85%         | rocm        |
| NVidia | gtx960   | 65%          | 69%         | cuda        |
| Intel  | Arc A380 | 40%          | 54%         | xpu         |

Unfortunately my code isn’t optimal enough for Intel yet, but on the bright side I think I can integrate oneDNN quite easily - that is the next step.