Distribution of a PyTorch extension


I implemented a new operator in CUDA and would like to make it available on pip/conda. What is the best way to do it?

  • Ideally I would build a wheel, but I’ve run into a lot of issues generating anything that is remotely portable from one environment/machine to another. Another issue I don’t know how to solve is how to automatically pick the right wheel based on the PyTorch version the user has installed.
  • If users compile the package when it is installed, they need nvcc, which is not installed with PyTorch by the recommended installation commands. Moreover, the GCC version that ships with conda doesn’t seem to be compatible with CUDA 11.7, so it seems I would need users to have a custom GCC/CUDA combination installed outside of conda.
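Regarding picking the right wheel automatically: one option (a sketch; `BUILT_AGAINST`, the function name, and the assumption that patch releases keep a compatible ABI are all mine, not from any official API) is an import-time guard that compares the installed torch version against the one the wheel was built for:

```python
# Sketch of an import-time guard for a prebuilt extension wheel.
# BUILT_AGAINST and check_torch_version are illustrative names; the
# assumption that patch releases stay ABI-compatible is just that.
BUILT_AGAINST = (2, 1)  # torch (major, minor) used when building the wheel

def check_torch_version(installed: str) -> bool:
    """Return True if the installed torch version matches the build."""
    major, minor = (int(x) for x in installed.split(".")[:2])
    return (major, minor) == BUILT_AGAINST
```

A real package would raise an informative ImportError instead of returning a bool, before attempting to load the compiled .so.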

Am I missing something? What is the recommended way of distributing PyTorch extensions?

Thanks for the help

Regarding the wheel: for example, a wheel compiled on another machine with the same Python version and the same major PyTorch version yields this error when imported:

site-packages/fast_jl.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE

(which is a symbol from PyTorch’s c10 library)
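For what it’s worth, that mangled name decodes to `c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<...>)`. The `__cxx11` fragment marks the new libstdc++ string ABI, so an undefined symbol like this typically means the wheel and the installed libtorch disagree on the exact torch build or on `_GLIBCXX_USE_CXX11_ABI`. A tiny sketch of spotting the ABI tag in a mangled symbol (the helper name is mine):

```python
# The "__cxx11" namespace tag in a mangled symbol indicates that
# std::string was compiled with the C++11 libstdc++ ABI
# (_GLIBCXX_USE_CXX11_ABI=1); its absence indicates the old ABI.
def uses_cxx11_string_abi(mangled: str) -> bool:
    return "__cxx11" in mangled

# The symbol from the import error above:
SYMBOL = ("_ZN3c105ErrorC2ENS_14SourceLocationENSt7__cxx1112"
          "basic_stringIcSt11char_traitsIcESaIcEEE")
```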

Creating binaries is not trivial if you are shipping C++/CUDA code, so you could take a look at our pytorch/builder repository to see how the general process is done for PyTorch.
Ideally, you would ship your package with a statically linked CUDA runtime library and would then link against PyTorch and its dependencies.
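A minimal sketch of a build script along those lines, using `torch.utils.cpp_extension` (the package/source names come from the error above and are otherwise illustrative; the static-cudart flag and extra libraries are an assumption about the toolchain, so verify the result with `ldd` on the built .so):

```python
# Hedged sketch of a setup.py that statically links the CUDA runtime.
# On Linux, static cudart typically also needs librt/pthread/dl; check
# your toolchain and confirm with `ldd` that libcudart is not a dynamic
# dependency of the resulting extension.
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name="fast_jl",  # package name taken from the error message above
    ext_modules=[
        CUDAExtension(
            name="fast_jl",
            sources=["fast_jl.cpp", "fast_jl_kernel.cu"],  # illustrative
            libraries=["cudart_static", "rt", "pthread", "dl"],
            extra_compile_args={"cxx": [], "nvcc": ["--cudart=static"]},
        )
    ],
    cmdclass={"build_ext": BuildExtension},
)
```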

So statically link against CUDA and dynamically link against PyTorch? What if PyTorch upgrades?

Do you have an example of such a setup?

I can’t even get it to compile reliably at install time: with GCC 11.3 and the latest version of PyTorch it fails with this issue:

It works with older versions of GCC, but since 11.3 is the default compiler shipped with Ubuntu 22.04, it will be a pain for users to have to downgrade their compiler just to install a Python package.
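One mitigation (a sketch; the version table is an assumption loosely based on CUDA’s documented host-compiler support, so check the release notes) is to test the host compiler in setup.py and fail with a readable message instead of a wall of nvcc errors:

```python
# Assumed maximum supported GCC major version per CUDA toolkit release;
# these values are illustrative and should be verified against the CUDA
# installation guide's host compiler support table.
MAX_GCC = {"11.7": 11, "11.8": 11, "12.0": 12}

def gcc_supported(gcc_major: int, cuda_version: str) -> bool:
    """Return True if this GCC major version is expected to work."""
    limit = MAX_GCC.get(cuda_version)
    if limit is None:  # unknown toolkit: don't block the build
        return True
    return gcc_major <= limit
```

In setup.py you would call this before invoking the extension build and raise a RuntimeError naming the supported compiler range.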

Do you think it would be simpler to ditch the PyTorch interface and use CuPy or PyCUDA instead?

Statically linking the CUDA runtime is the right approach. I don’t know which additional CUDA libraries you would need.

In that case your package might need to be updated as well assuming it depends on PyTorch directly. If you don’t use any PyTorch operations, your updates would be more flexible.
You can also take a look at e.g. pytorch_geometric, which should have a release process similar to the one you are targeting.
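For reference, pytorch_geometric publishes one wheel per torch/CUDA combination on a separate find-links index whose URL encodes both, and users select it via `pip install ... -f <url>`. A sketch of that URL convention (the domain and pattern mirror pyg’s public index and are shown for illustration only):

```python
# Build a pyg-style find-links URL encoding the torch version and CUDA tag,
# e.g. for `pip install your-package -f <url>`. The pattern follows
# pytorch_geometric's convention; treat it as illustrative.
def find_links_url(torch_version: str, cuda_tag: str) -> str:
    return f"https://data.pyg.org/whl/torch-{torch_version}+{cuda_tag}.html"
```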

I don’t fully understand why users would need to build your package if you are planning to release wheels.

I think it depends on your use case and what your package is supposed to do.
If you don’t have any dependency on PyTorch and are not using any of its operations, you might not want to add it as a dependency, and using e.g. PyCUDA might be easier.