Torch+TF+Spacy+more+GPU+ Python 3.7+Ubuntu 22.04

I am running a two-year-old program from GitHub that only works with Python 3.7 (it does not work with Python 3.8, 3.9, or 3.10). It uses TensorFlow, torch, and spaCy, all with GPU support, along with many other modules. I was able to run the program without a GPU: with torch=1.13.1 and TF=2.9.0 it warns that CUDA is not available, but otherwise runs without errors and produces correct results.

I spent a week trying to make it work with GPU. With Python 3.7, TF is upper-limited to Cuda=11.2 and Cudnn=8.1, yet there is no torch+cu112. It means I have to have two different versions of Cuda at the same time.
conda install -c conda-forge cudatoolkit=11.2.2 cudnn=8.1.0 # for TF and Spacy
pip install spacy[cuda112]
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url
(py37) $ python -c "import torch; print(f'PyTorch version {torch.__version__} has CUDA : {torch.cuda.is_available()}')"
PyTorch version 1.12.1+cu113 has CUDA : True

When I run the program, depending on torch+cuda version, I get various torch errors. For example:

RuntimeError: CUDA out of memory. Tried to allocate 148.00 MiB (GPU 0; 15.71 GiB total capacity; 1.33 GiB already allocated; 50.50 MiB free; 1.45 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

It appears there is 15.71 GiB total capacity of GPU yet torch is not able to allocate 148.00 MiB.
In addition to torch==1.12.1+cu113, I tried other torch+cuda versions. Some of them install ok and appear to recognize my GPU ok, but all fail with various torch errors when I run the program.
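
The numbers in the error message can be used for a quick sanity check. The sketch below parses the message quoted above and subtracts the free memory and the memory reserved by PyTorch from the total capacity. The helper names (size_before, memory_held_elsewhere) are made up for illustration, and the result is only a rough estimate of memory held outside PyTorch's allocator (for example by another framework, another process, or CUDA context overhead).

```python
import re

# Error text copied from the RuntimeError quoted above (trailing hint omitted).
MESSAGE = (
    "CUDA out of memory. Tried to allocate 148.00 MiB (GPU 0; 15.71 GiB total capacity; "
    "1.33 GiB already allocated; 50.50 MiB free; 1.45 GiB reserved in total by PyTorch)"
)

UNIT_TO_GIB = {"MiB": 1 / 1024, "GiB": 1.0}


def size_before(message, label):
    """Return the '<number> <unit>' value that precedes `label`, converted to GiB."""
    match = re.search(r"([\d.]+) (MiB|GiB) " + re.escape(label), message)
    if match is None:
        raise ValueError(f"no size found before {label!r}")
    value, unit = match.groups()
    return float(value) * UNIT_TO_GIB[unit]


def memory_held_elsewhere(message):
    """Estimate GiB in use on the device that PyTorch has not reserved itself."""
    total = size_before(message, "total capacity")
    free = size_before(message, "free")
    reserved = size_before(message, "reserved")
    return total - free - reserved


if __name__ == "__main__":
    print(f"Held outside PyTorch: {memory_held_elsewhere(MESSAGE):.2f} GiB")
```

For this message the estimate comes out at roughly 14.2 GiB, so most of the card appears to be occupied by something other than PyTorch at the moment the allocation fails.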

I am aware that Python 3.7 is no longer officially supported, but I still hope to get the GPU to work. Is running without a GPU my only option?

Collecting environment information…
PyTorch version: 1.12.1+cu113
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.3 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Libc version: glibc-2.10

Python version: 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:21) [GCC 9.4.0] (64-bit runtime)
Python platform: Linux-6.2.0-36-generic-x86_64-with-debian-bookworm-sid
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4060 Ti
Nvidia driver version: 535.129.03
cuDNN version: Could not collect
Is XNNPACK available: True

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Vendor ID: GenuineIntel
Model name: 12th Gen Intel(R) Core™ i9-12900K
CPU family: 6
Model: 151
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 2
CPU max MHz: 5200.0000
CPU min MHz: 800.0000
BogoMIPS: 6374.40

Virtualization: VT-x
L1d cache: 640 KiB (16 instances)
L1i cache: 768 KiB (16 instances)
L2 cache: 14 MiB (10 instances)
L3 cache: 30 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0-23

Versions of relevant libraries:
[pip3] numpy==1.21.6
[pip3] torch==1.12.1+cu113
[pip3] torchaudio==0.12.1+cu113
[pip3] torchvision==0.13.1+cu113
[conda] cudatoolkit 11.2.2 hc23eb0c_12 conda-forge
[conda] numpy 1.21.6 pypi_0 pypi
[conda] torch 1.12.1+cu113 pypi_0 pypi
[conda] torchaudio 0.12.1+cu113 pypi_0 pypi
[conda] torchvision 0.13.1+cu113 pypi_0 pypi

TF will usually try to allocate all device memory and might not leave anything for PyTorch to use. Disable this behavior and check whether you still run out of memory. If that doesn't help, you could build both frameworks from source with the deprecated Python version.
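
A minimal sketch of how the preallocation could be turned off, assuming TF 2.x and that this code runs at the very start of the program, before anything touches the GPU. The function name enable_tf_memory_growth is hypothetical, and where to call it in the actual program is something only the code base can tell.

```python
import os

# Must be set before TensorFlow is imported; asks TF to grow its GPU
# allocation on demand instead of reserving nearly all device memory.
os.environ.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")


def enable_tf_memory_growth():
    """Enable on-demand GPU memory growth in TensorFlow, if it is installed.

    Returns the number of GPUs configured (0 if TF or a GPU is missing).
    """
    try:
        import tensorflow as tf
    except ImportError:
        return 0
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        # Has to happen before the GPU is initialized, otherwise TF raises a RuntimeError.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


if __name__ == "__main__":
    print("GPUs configured for memory growth:", enable_tf_memory_growth())
```

The environment variable and the explicit call do the same job; either one alone may be enough. If the out-of-memory error persists with growth enabled, TF is probably not the process holding the memory.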

Is it in principle a good idea to run side by side apps requiring different (multiple) versions of Cuda? Would I be better off if using Cuda 11.0 and Cudnn 8.0 for both TF and torch (and spacy)?
conda install -c conda-forge -c nvidia cudatoolkit=11.0 cudnn=8.0 # for TF and Spacy
pip install spacy[cuda110]

pip install torch==1.*+cu110 torchvision==*+cu110 torchaudio==* --extra-index-url

Which version of torch with Cu110 is most suitable for my setup (Conda+ Python 3.7 +Ubuntu 22.04) ?

No, I would recommend trying to align the stack as much as possible.
I also don’t know how exactly TF uses CUDA and if it depends on e.g. conda binaries or your locally installed CUDA toolkit.

Also no, since CUDA 11.0 is quite old by now and you might be running into known and already fixed issues.

The cleanest way would be to update the used code base to be compatible with a newer Python version and then to install the latest (compatible) binaries.
However, I also don’t know what exactly is used that is not supported in Python>=3.8 and how hard it would be to port it over.

I would like to build PyTorch 1.12.1 and 1.13.1 from source for Cuda=11.2 and Ubuntu 22.04 without installing Nvidia Cuda system-wide.

Can I achieve that in conda environment with conda install -c conda-forge cudatoolkit=11.2.2 cudnn=8.1.0 ?

No, as cudatoolkit should only ship the runtime libraries. You could search for a CUDA compiler conda package instead, but I don’t have any experience with it as I’m using a locally installed CUDA toolkit for my source builds.

I will install CUDA 11.2 locally, too. How do I select a PyTorch (and then torchvision) source version that is compatible with Python 3.7?
I plan to follow the tutorial "Build PyTorch from Source with CUDA 11.8 with Ubuntu 22.04" by Zhanwen Chen on Medium, but for CUDA 11.2.

I would probably go through release notes or check the binary names to see when Python 3.7 was used as well as an early CUDA 11 version.

On pytorch website
I see precompiled

So Python 3.7 was supported for versions v1.13.1 and v1.12.1.
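
Checking the binary names can be scripted. The sketch below filters wheel file names by their Python tag (cp37 for Python 3.7) and prints the PyTorch version and CUDA tag of each match. The file names in EXAMPLE_NAMES are illustrative examples of the naming pattern, not a listing copied from the download index, and parse_wheel and wheels_for_python are hypothetical helpers.

```python
import re

# Wheel names are assumed to follow the pattern
# torch-<version>+<cuda tag>-<python tag>-<abi tag>-<platform>.whl
WHEEL_RE = re.compile(
    r"^torch-(?P<version>[\d.]+)\+(?P<cuda>cu\d+|cpu)-(?P<python>cp\d+)-"
    r"(?P<abi>[^-]+)-(?P<platform>.+)\.whl$"
)


def parse_wheel(name):
    """Split a torch wheel file name into its tags, or return None if it does not match."""
    match = WHEEL_RE.match(name)
    return match.groupdict() if match else None


def wheels_for_python(names, python_tag="cp37"):
    """Keep only the wheels built for the given CPython tag."""
    parsed = (parse_wheel(name) for name in names)
    return [p for p in parsed if p and p["python"] == python_tag]


# Illustrative names only; replace with the names seen on the download page.
EXAMPLE_NAMES = [
    "torch-1.12.1+cu113-cp37-cp37m-linux_x86_64.whl",
    "torch-1.13.1+cu117-cp37-cp37m-linux_x86_64.whl",
    "torch-2.0.0+cu117-cp38-cp38-linux_x86_64.whl",
]

if __name__ == "__main__":
    for wheel in wheels_for_python(EXAMPLE_NAMES):
        print(wheel["version"], wheel["cuda"])
```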

git clone --recursive _
cd pytorch
git checkout v1.13.1  # or v1.12.1

I will be compiling in conda environment with Python=3.7.
So, if I specify v1.13.1 (or v1.12.1) in the above checkout, will it be sufficient to get binaries compiled for Python 3.7 with Cuda 11.2?

I guess so, but eventually you should just try building it.
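
Once a build finishes, a quick check inside the conda environment can confirm which Python and CUDA versions the resulting binary actually targets. This is only a sketch: describe_build is a hypothetical helper, and it reports what the installed torch package says about itself, which is not proof that every CUDA kernel works on the card.

```python
import sys


def describe_build():
    """Collect the interpreter version and, if torch is importable, its CUDA build info."""
    info = {"python": "{}.{}".format(*sys.version_info[:2])}
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment; report that and stop.
        info["torch"] = None
        return info
    info["torch"] = torch.__version__
    info["cuda_build"] = torch.version.cuda  # CUDA version torch was compiled against, or None
    info["cudnn"] = torch.backends.cudnn.version()  # integer version, or None if unavailable
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_build().items():
        print(f"{key}: {value}")
```

For the build discussed here, the hoped-for result would be python 3.7 together with a cuda_build of 11.2.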