Can I compile a cuda compatible version of pytorch on a machine with no GPUs available?
I have access to a computer with a high CPU count but no GPUs, and I want to leverage it to compile PyTorch faster. I am compiling inside an NVIDIA Docker container, which I then use on my machine that does have GPUs. nvidia-smi reports the GPUs properly, and nvcc seems to be working.
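For reference, this kind of GPU-less build typically needs the target GPU architectures spelled out up front, since the build machine has no device to query. A minimal sketch of the build environment, assuming a source checkout of the pytorch repo; the "8.6" architecture value is an example and must be matched to the GPUs on the target machine:

```shell
# Sketch: building a CUDA-enabled PyTorch wheel on a machine with no GPU.
# TORCH_CUDA_ARCH_LIST must be set explicitly, because without a GPU the
# build cannot auto-detect compute capabilities. "8.6" is an example value;
# substitute the compute capability of the GPUs on the target machine.
export USE_CUDA=1
export TORCH_CUDA_ARCH_LIST="8.6"
export MAX_JOBS=$(nproc)      # exploit the high CPU count for parallel compilation

# Run from the root of a pytorch source checkout:
python setup.py bdist_wheel
```

The resulting wheel can then be installed inside the container image that runs on the GPU machine.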
But when I run torch.cuda.is_available() in python I get:
/opt/conda/envs/computer_vision/lib/python3.8/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at /opt/pytorch/c10/cuda/CUDAFunctions.cpp:100.)
return torch._C._cuda_getDeviceCount() > 0
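One way to narrow this kind of error down is to separate "the install is a CPU-only build" from "the build has CUDA but the runtime can't initialize it". A small diagnostic sketch (it degrades gracefully if torch isn't importable at all):

```python
import importlib.util

def cuda_build_report():
    """Return a dict describing the installed torch's CUDA support, if any."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}
    import torch
    return {
        "torch_installed": True,
        # None here means a CPU-only build was installed (e.g. a conda downgrade).
        "built_with_cuda": torch.version.cuda,
        # False here despite a CUDA build points at a driver/runtime problem instead.
        "runtime_sees_gpu": torch.cuda.is_available(),
    }

print(cuda_build_report())
```

If built_with_cuda is None, the problem is the package; if it is set but runtime_sees_gpu is False, the problem is the driver or container setup.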
It should be possible to use nvcc without a GPU.
The error message might be raised if your NVIDIA driver or local CUDA toolkit isn't properly installed or found. Did you launch the Docker container via nvidia-docker or with the --gpus=all option?
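For completeness, the two launch styles being asked about look like this; the image tag is the one mentioned in this thread, and nvidia-smi stands in for whatever command you actually run:

```shell
# Modern Docker (19.03+) with the NVIDIA Container Toolkit installed:
docker run --gpus all -it nvidia/cuda:11.2.1-cudnn8-devel nvidia-smi

# Older style, via the nvidia-docker wrapper:
nvidia-docker run -it nvidia/cuda:11.2.1-cudnn8-devel nvidia-smi
```

If nvidia-smi works inside the container launched this way but torch.cuda.is_available() still fails, the container setup is likely fine and the PyTorch build itself is the next suspect.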
I am using driver 460.32.03 with the NVIDIA container cuda:11.2.1-cudnn8-devel, and I installed magma-cuda112. Could it be that magma-cuda112 and cuda:11.2.1-cudnn8-devel don't mix?
How did you install magma-cuda112? If you installed it as a conda package, did the install logs show that PyTorch would be downgraded to a CPU-only version?
If not, I don't think magma-cuda112 should affect PyTorch's ability to find a GPU.
Sorry, I missed the reply. It seems to vary whether conda wants to downgrade or not; I never dug deep enough to work out why. But I did get it working without the downgrade.