RuntimeError: Unexpected error from cudaGetDeviceCount()

I was training a GCN model on my Linux server when I suddenly got this error:

RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW

  • PyTorch version: 1.10.1+cu102
  • OS: Linux
  • Python version: Python 3.8.10
  • CUDA Version: 11.2

Is nvidia-smi returning any errors or complaining about a driver mismatch? If so, could you restart the server and check whether that helps? If not, did you recently update any drivers, or are you manually trying to get forward compatibility working on non-server GPUs?
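The nvidia-smi check asked about here can also be scripted. This is a minimal sketch, assuming `nvidia-smi` is on the PATH and prints its usual "Driver Version: ... CUDA Version: ..." banner; `parse_smi_header` and `check_nvidia_smi` are hypothetical helper names, not part of any NVIDIA tool:

```python
import re
import shutil
import subprocess


def parse_smi_header(text):
    """Extract (driver_version, cuda_version) from nvidia-smi banner text."""
    driver = re.search(r"Driver Version:\s*([\d.]+)", text)
    cuda = re.search(r"CUDA Version:\s*([\d.]+)", text)
    return (driver.group(1) if driver else None,
            cuda.group(1) if cuda else None)


def check_nvidia_smi():
    """Run nvidia-smi and return (ok, driver_version, cuda_version)."""
    if shutil.which("nvidia-smi") is None:
        # Tool not installed or not on PATH: nothing to parse.
        return False, None, None
    proc = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    driver, cuda = parse_smi_header(proc.stdout)
    # A non-zero return code usually means the driver could not be reached.
    return proc.returncode == 0, driver, cuda


if __name__ == "__main__":
    print(check_nvidia_smi())
```

A non-zero return code or missing versions would point at the driver itself rather than at PyTorch.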

No, it doesn’t return any errors:

NVIDIA-SMI 450.57, Driver Version: 450.57 , CUDA Version: 11.2

I have restarted it many times, but the problem is still the same.

I didn’t do any updates. I installed PyTorch and it installed successfully.

Sir, by doing:

! python -c "import torch; print(torch.cuda.is_available())"

I got:

/usr/local/lib/python3.8/dist-packages/torch/cuda/__init__.py:80: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW (Triggered internally at  ../c10/cuda/CUDAFunctions.cpp:112.)
  return torch._C._cuda_getDeviceCount() > 0
False
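Because `torch.cuda.is_available()` only warns and returns False in this situation, it can help to print the CUDA version the installed wheel was built for next to it. A minimal sketch; `torch_cuda_report` is a hypothetical helper name, and the import is guarded so the script also runs where PyTorch is not installed:

```python
def torch_cuda_report():
    """Collect the installed PyTorch build info and CUDA visibility."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"torch": None, "built_for_cuda": None, "cuda_available": False}
    return {
        "torch": torch.__version__,             # e.g. 1.10.1+cu102
        "built_for_cuda": torch.version.cuda,   # CUDA version the wheel targets
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in torch_cuda_report().items():
        print(f"{key}: {value}")
```

Comparing `built_for_cuda` with the versions reported by nvidia-smi shows at a glance whether the wheel and the driver were meant for each other.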

Based on this issue, other users were running into the same error message if:

  • their setup was broken due to a driver/library mismatch (rebooting seemed to solve the issue)
  • their installed drivers didn’t match the user-mode driver inside a Docker container (and forward compatibility failed because non-server GPUs were used)
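Both cases come down to which driver libraries the process actually loads. The following is a rough sketch for spotting the container case. It assumes that Docker creates `/.dockerenv` inside containers and that a forward-compatibility package, if present, shows up as a `compat` directory on `LD_LIBRARY_PATH` (for example `/usr/local/cuda/compat`). Both are heuristics, not guarantees, and the helper names are hypothetical:

```python
import os


def compat_entries(ld_library_path):
    """Return LD_LIBRARY_PATH entries that contain a 'compat' path component."""
    entries = [p for p in ld_library_path.split(":") if p]
    return [p for p in entries if "compat" in p.split("/")]


def in_docker():
    """Heuristic: Docker normally creates /.dockerenv inside containers."""
    return os.path.exists("/.dockerenv")


if __name__ == "__main__":
    print("in docker:", in_docker())
    print("compat dirs:", compat_entries(os.environ.get("LD_LIBRARY_PATH", "")))
```

If a compat directory is listed on a machine with a non-server GPU, that would be consistent with Error 804, though it does not prove it.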

Was your setup working before, and if so, what changed?

Thank you Sir :). My problem is solved.
I fixed it by running:

!pip3 install torch==1.10.1+cu113 torchvision==0.11.2+cu113 torchaudio==0.10.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
!pip3 install torch-scatter -f https://data.pyg.org/whl/torch-1.10.0+cu113.html
!pip3 install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-1.10.0+cu113.html
!pip3 install torch-cluster -f https://data.pyg.org/whl/torch-1.10.0+cu113.html
!pip3 install torch-geometric
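The wheel index URLs in these commands follow a pattern: the PyTorch version plus the CUDA tag. Here is a small sketch that builds them, assuming the pattern seen in the commands above also holds for other versions; `pyg_wheel_index` and `pyg_install_commands` are hypothetical helpers, not part of PyTorch Geometric:

```python
def pyg_wheel_index(torch_version, cuda_tag):
    """Build the PyG wheel index URL for a torch version and CUDA tag."""
    return f"https://data.pyg.org/whl/torch-{torch_version}+{cuda_tag}.html"


def pyg_install_commands(torch_version, cuda_tag,
                         packages=("torch-scatter", "torch-sparse", "torch-cluster")):
    """Return one pip command per extension package, without duplicates."""
    index = pyg_wheel_index(torch_version, cuda_tag)
    unique = list(dict.fromkeys(packages))  # keep order, drop repeats
    return [f"pip3 install {name} -f {index}" for name in unique]


if __name__ == "__main__":
    for command in pyg_install_commands("1.10.0", "cu113"):
        print(command)
```

Generating the commands from one version and one CUDA tag keeps the extension wheels in step with the installed torch build.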