I implemented a new operator in CUDA and would like to make it available on pip/conda. What is the best way to do it?
Ideally I would build a wheel, but I’ve run into a lot of issues generating something that is remotely portable from one environment/machine to another. Another issue I don’t know how to solve is how to automatically pick the right wheel based on the PyTorch version the user has installed.
If users compile the extension at install time, they need nvcc, which is not installed with PyTorch via the recommended installation commands. Moreover, the GCC version that ships with conda doesn’t seem to be compatible with CUDA 11.7, so it looks like users would need a specific GCC/CUDA combination installed outside of conda.
Am I missing something? What is the recommended way of distributing PyTorch extensions?
Creating binaries is not trivial if you are shipping C++/CUDA code; you could take a look at our pytorch/builder repository to see how the general process works for PyTorch itself.
Ideally, you would ship your package with a statically linked CUDA runtime library and would then link against PyTorch and its dependencies.
Statically linking the CUDA runtime is the right approach. I don’t know which additional CUDA libraries you would need.
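As a rough sketch, a `setup.py` built on `torch.utils.cpp_extension` could pass `--cudart=static` to nvcc; statically linking the CUDA runtime is nvcc’s default on recent toolkits, but stating it explicitly documents the intent. The package and source names below are placeholders, not part of any real project:

```python
# setup.py -- minimal sketch, assuming a single CUDA source file.
# "my_op" and "my_op_kernel.cu" are placeholder names for your extension.
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name="my_op",
    ext_modules=[
        CUDAExtension(
            name="my_op._C",
            sources=["my_op_kernel.cu"],
            extra_compile_args={
                "cxx": ["-O3"],
                # Link the CUDA runtime statically so end users don't need
                # a local CUDA toolkit at runtime (nvcc's default, made explicit).
                "nvcc": ["-O3", "--cudart=static"],
            },
        )
    ],
    cmdclass={"build_ext": BuildExtension},
)
```

If your kernels also use libraries such as cuBLAS or cuDNN, those would need to be handled separately, since static linking the runtime alone doesn’t cover them.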
When a new PyTorch version is released, your package might need to be rebuilt as well, assuming it links against PyTorch directly. If you don’t use any PyTorch operations, your updates would be more flexible.
You can also take a look at e.g. pytorch_geometric, which should have a release process similar to the one you are targeting.
I don’t fully understand why users would need to build your package if you are planning to release wheels.
I think it depends on your use case and what your package is supposed to do.
If you don’t have any dependency on PyTorch and are not using any of its operations, you might not want to add it as a dependency; using e.g. pycuda might be easier.