Problem with custom extensions and PyTorch installation using Anaconda

I installed PyTorch with conda in a fresh environment on my Manjaro Linux system, as described in “Get Started” / “Start Locally”, using the given command:

conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

This worked fine: torch.cuda.is_available() returns True, and as a test I successfully trained a model on my GPU (an RTX 4090).
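For reference, the sanity check I ran looked roughly like this (the outputs shown are of course specific to my setup):

import torch

print(torch.__version__)              # version installed by the conda command above
print(torch.version.cuda)             # CUDA version PyTorch was built against, 11.8 here
print(torch.cuda.is_available())      # True
print(torch.cuda.get_device_name(0))  # NVIDIA GeForce RTX 4090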

However, it looks like I can’t use custom extensions, such as the CUDA kernels provided for Swin Transformer. If I run its setup.py, I get the error:

OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

I also noticed that nvcc is not available in the shell.
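For example, both of these come up empty inside the activated conda environment:

which nvcc        # no output, nvcc is not on the PATH
echo $CUDA_HOME   # empty, the variable is not set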

I’ve read quite a few related posts across different forums, but couldn’t find a solution to this problem.

It looks like I have to install CUDA separately, but its version also needs to match the CUDA version PyTorch was built with. If I install CUDA via Manjaro’s package manager (rather than via conda), the versions don’t match and it doesn’t work.

How can I install both PyTorch and CUDA with conda to make custom extensions work?

Yes, that’s correct: you should install a local CUDA toolkit, including the nvcc compiler, to be able to build custom CUDA extensions.

You could experiment with the cuda package on conda, as it also ships with nvcc, but I have never used it myself and don’t know how easy it is compared to just installing CUDA locally.
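As a rough sketch (untested on my side, and assuming the versioned label on the nvidia channel matches your PyTorch build), it might look like this:

conda install cuda -c nvidia/label/cuda-11.8.0   # CUDA toolkit incl. nvcc, matching pytorch-cuda=11.8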

Thanks for your answer!

It turned out that installing CUDA in conda worked fine. I just installed the CUDA package corresponding to the required version (11.8 in my case), and additionally installed the gcc_linux-64 and gxx_linux-64 packages, which were needed because the locally installed GCC version wasn’t compatible.
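Roughly, the commands were along these lines (the channel label spelling is taken from NVIDIA’s docs, so it may need adjusting):

# CUDA 11.8 toolkit with nvcc, matching the pytorch-cuda=11.8 install
conda install cuda -c nvidia/label/cuda-11.8.0

# conda-packaged host compilers, needed because the system GCC wasn’t compatible
conda install gcc_linux-64 gxx_linux-64

# if the build still complains about CUDA_HOME, pointing it at the
# environment prefix should work:
# export CUDA_HOME=$CONDA_PREFIX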

After that, I successfully ran the setup.py script from Swin Transformer and used the compiled functions.