Hi!
I have installed CUDA 12.1 on a machine with 8 x A100 GPUs, along with the latest PyTorch build for CUDA 12.1.
No matter what I try, I cannot get rid of the following error, which is driving me crazy:
Python 3.11.7 (main, Dec 15 2023, 18:12:31) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
/home/ubuntu/.build/miniconda3/envs/pytorch/lib/python3.11/site-packages/torch/cuda/__init__.py:138: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 802: system not yet initialized (Triggered internally at /opt/conda/conda-bld/pytorch_1702400430266/work/c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() > 0
False
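In case it helps with diagnosis, the following is a minimal sketch of a script that gathers the basic environment details around the session above: the PyTorch version, the CUDA version PyTorch was built for, and what the driver reports. The function name `collect_cuda_diagnostics` is hypothetical, and the script assumes `nvidia-smi` is on the PATH (it reports that fact instead of failing if it is not). It only collects information and does not attempt to fix anything.

```python
import shutil
import subprocess


def collect_cuda_diagnostics():
    """Gather basic facts about the PyTorch/CUDA setup without raising."""
    info = {}

    # PyTorch may not be importable in every environment, so guard the import.
    try:
        import torch

        info["torch_version"] = torch.__version__
        # None for CPU-only builds of PyTorch.
        info["torch_built_for_cuda"] = torch.version.cuda
        info["cuda_available"] = bool(torch.cuda.is_available())
    except ImportError:
        info["torch_version"] = None
        info["torch_built_for_cuda"] = None
        info["cuda_available"] = False

    # nvidia-smi ships with the NVIDIA driver; its absence is itself informative.
    smi = shutil.which("nvidia-smi")
    if smi is None:
        info["nvidia_smi"] = "nvidia-smi not found on PATH"
    else:
        try:
            result = subprocess.run(
                [smi, "--query-gpu=name,driver_version", "--format=csv,noheader"],
                capture_output=True,
                text=True,
                timeout=60,
            )
            info["nvidia_smi"] = (result.stdout or result.stderr).strip()
        except (OSError, subprocess.TimeoutExpired) as exc:
            info["nvidia_smi"] = f"nvidia-smi failed: {exc}"

    return info


if __name__ == "__main__":
    for key, value in collect_cuda_diagnostics().items():
        print(f"{key}: {value}")
```

Running it inside the same conda environment as the session above and posting the output alongside the warning would show whether the driver sees the GPUs even though PyTorch does not.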
I would really appreciate any help!