I have an image processing package on PyPI, and some of its algorithms were implemented with TF/Keras. I decided to migrate these to torch models. I migrated everything and tested the torch implementations; all work great.
I specify my dependencies in setup.cfg, and the torch wheel on PyPI currently ships with CUDA 10.2 support. Some of my users with newer cards and drivers (e.g. driver version 470+) hit errors like RuntimeError: CUDA error: no kernel image is available for execution on the device. For these users, removing the torch installation and re-installing it with pip install torch --extra-index-url https://download.pytorch.org/whl/cu113 solves the problem.
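For reference, this is roughly how I diagnose the mismatch on an affected machine. It's a heuristic sketch: it only compares compiled SASS architectures and ignores PTX "compute_XX" JIT fallback entries, which a newer driver can sometimes compensate for.

```python
# Heuristic sketch: check whether the installed torch wheel ships kernels
# for this GPU. A missing arch usually surfaces as
# "CUDA error: no kernel image is available for execution on the device".
# Note: PTX entries like "compute_75" in the arch list can still be
# JIT-compiled by the driver, so treat a mismatch as a strong hint, not proof.
import torch

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    device_arch = f"sm_{major}{minor}"
    wheel_archs = torch.cuda.get_arch_list()  # e.g. ['sm_37', ..., 'sm_75']
    if device_arch not in wheel_archs:
        print(f"{device_arch} not in {wheel_archs}; reinstall torch from a "
              "matching CUDA index, e.g. --extra-index-url .../whl/cu113")
```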
I also have a frozen application built with PyInstaller, where I expect similar problems.
I know I can always ask users to install torch manually, but that's cumbersome. Are there any better/best practices that I might not be aware of? What is the suggested way to have torch as a dependency and publish a package on PyPI that works cross-platform out of the box as much as possible?
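For context, my dependency spec today is just a plain requirement, shown here as the equivalent setup() call (my real project uses setup.cfg, and the version bound is hypothetical). Nothing in this form can select a CUDA variant of torch:

```python
# Sketch of my current dependency spec (equivalent setup() form of the
# setup.cfg; the name and version bound are hypothetical). A plain
# requirement can only pin a version -- it cannot choose a CUDA variant,
# since PyPI hosts a single torch wheel per platform/Python version.
from setuptools import setup

setup(
    name="my-image-package",  # hypothetical name
    install_requires=["torch>=1.10"],
)
```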
I’m not aware of a proper way to specify these dependencies and think that, for now, asking the user to install the appropriate binary is the way to go.
However, we are right now about to remove the CUDA 10.2 binaries and host the CUDA 11.7.1 binaries on PyPI (using the CUDA pip wheels as a dependency) so that these would be installed from the default index. We are trying to fix the last compatibility issues and are hoping to have something ready by the end of this week so that it can make it into the next stable 1.13.0 release.
Let me know if you get a response there, as I would also be interested in these options. I hope we won’t need them soon, but it’s still interesting to learn about these approaches.
I am wondering whether uploading these different binaries under different version names to PyPI could be a solution for pytorch? I know you said the CUDA 11.7.1 binaries will be hosted on PyPI soon, but I am not confident about the forward compatibility of the NVIDIA drivers. What do you think about this idea, @ptrblck? If you are positive, what should I do next for this discussion? Creating an issue, maybe?
I see, does that mean torch with CUDA 10.2 is significantly smaller than torch with CUDA 11.3? Or is there a storage cap per version on PyPI?
Users with drivers predating CUDA 11.* support previously reported runtime issues with the things I built with CUDA 11.3. Using this: CUDA Compatibility :: NVIDIA Data Center GPU Driver Documentation, some of them upgraded and were able to run the programs smoothly. I can ping you whenever another reproducible example comes by, if you like.
Yes, CUDA 11.x is larger as it ships with more kernels for new architectures etc. While we could reduce the size a bit, it’s still way too large for us to host the PyTorch pip wheel including the CUDA runtime (in the same wheel) on PyPI. The new approach would be to build the PyTorch wheels with e.g. CUDA 11.7.1, but instead of shipping the wheel with the CUDA runtime, we would add a dependency on the CUDA pip wheels also hosted on PyPI (newly added in 11.7.1), which would reduce the size of the PyTorch wheels significantly.
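To illustrate, under that layout a user's environment would look something along these lines. The "nvidia-" prefix follows the newly published 11.7.1 wheels (e.g. nvidia-cuda-runtime-cu11), but treat the exact package set as an assumption:

```python
# Sketch: inspect how CUDA got into the environment under the new layout.
# Assumes a torch build that depends on NVIDIA's CUDA pip wheels; the
# "nvidia-" name prefix is based on the 11.7.1 wheels and is an assumption.
from importlib.metadata import distributions

import torch

print("torch", torch.__version__, "built against CUDA", torch.version.cuda)
cuda_wheels = sorted(
    dist.metadata["Name"]
    for dist in distributions()
    if (dist.metadata["Name"] or "").startswith("nvidia-")
)
print("CUDA pip wheels:", cuda_wheels or "none (runtime bundled in the torch wheel)")
```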
Yes, please do so. Minor version compatibility should work in all CUDA 11.x versions and we have to fix anything that breaks it.
Note that “minor version compatibility” was added in 11.x. The linked “forward compatibility” is used for Data Center GPUs (previously called Tesla GPUs).
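If you want to sanity-check a user's setup, a rough sketch like this works. It assumes nvidia-smi is on PATH, and uses 450.80.02, the documented Linux minimum driver for CUDA 11 minor version compatibility (Windows uses a different threshold):

```python
# Sketch: check whether the installed NVIDIA driver is new enough for
# CUDA 11.x minor version compatibility (>= 450.80.02 on Linux per the
# CUDA compatibility docs; the Windows threshold differs).
import subprocess

out = subprocess.run(
    ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
)
driver = out.stdout.strip().splitlines()[0]
ok = tuple(map(int, driver.split("."))) >= (450, 80, 2)
print(f"driver {driver}: {'supports' if ok else 'predates'} "
      "CUDA 11 minor version compatibility")
```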
In any case, feel free to ping me if you are seeing any issues and I can take a look.