RuntimeError: Unexpected error from cudaGetDeviceCount(); Error 802: system not yet initialized

Hi all,

I’m trying to set up the environment for CUDA before importing torch, but I’m running into the following error:

RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some CUDA functions before calling NumCudaDevices() that might have already set an error? Error 802: system not yet initialized.

Here’s the code I’m using:

import os

os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3,4,5,6,7"
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["PYTORCH_NVML_BASED_CUDA_CHECK"] = "1"

import torch
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device count:", torch.cuda.device_count())
    print("Torch version:", torch.__version__)
    print("Current device:", torch.cuda.current_device())
    print("Device name:", torch.cuda.get_device_name(torch.cuda.current_device()))

Am I setting these environment variables correctly? Is there something I’m missing that could cause cudaGetDeviceCount() to fail with Error 802?

Any insights would be appreciated!

Thanks in advance

What does the deviceQuery CUDA sample return? If it fails with the same error, check whether your system uses NVSwitch and, if so, whether FabricManager is properly installed and running.
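
If it helps, here is a rough sketch for running the driver-side checks above from Python without touching torch. It assumes a systemd-based Linux host and that the service is named `nvidia-fabricmanager`, which is what the NVIDIA packages normally install; adjust if your setup differs. The `run` and `diagnose` helpers are just illustrative names, not part of any library.

```python
import shutil
import subprocess


def run(cmd):
    """Run cmd; return (exit_code, output). exit_code is None if the tool is missing or hangs."""
    if shutil.which(cmd[0]) is None:
        return None, f"{cmd[0]} not found on PATH"
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except subprocess.TimeoutExpired:
        return None, f"{' '.join(cmd)} timed out"
    return proc.returncode, (proc.stdout + proc.stderr).strip()


def diagnose():
    """Collect driver-level information that does not depend on PyTorch."""
    checks = {
        # Lists the GPUs the driver can see.
        "GPUs seen by the driver": ["nvidia-smi", "-L"],
        # Assumption: systemd host, service named nvidia-fabricmanager.
        "FabricManager service state": ["systemctl", "is-active", "nvidia-fabricmanager"],
    }
    return {label: run(cmd) for label, cmd in checks.items()}


if __name__ == "__main__":
    for label, (code, out) in diagnose().items():
        print(f"{label}: exit={code}\n{out}\n")
```

If the GPUs are listed but the service is reported as inactive or not found on an NVSwitch machine, that would be consistent with Error 802. On a machine without NVSwitch the service is not expected to be present, so its absence alone does not point to a problem.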