It's also possible to run into this with a broken conda environment.
In my case, this:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "2" # just use one GPU on big machine
import torch
assert torch.cuda.device_count() == 1
failed, but the real cause was my broken environment. The assertion alone gave no hint of that; only
import torch
print(torch.cuda.current_device())
actually raised an error.
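
If you suspect the same thing, a quick check like the one below may help separate an environment problem from a `CUDA_VISIBLE_DEVICES` problem. This is only a rough sketch: `diagnose_cuda` is a name I made up for illustration, not part of PyTorch, and it assumes nothing in the process has initialized CUDA before it runs.

```python
import importlib
import os


def diagnose_cuda(visible_devices=None):
    """Collect basic CUDA facts into a dict (hypothetical helper, not a PyTorch API)."""
    if visible_devices is not None:
        # Only takes effect if set before CUDA is initialized in this process.
        os.environ["CUDA_VISIBLE_DEVICES"] = visible_devices
    report = {"CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES")}

    try:
        torch = importlib.import_module("torch")
    except Exception as exc:  # a broken install can fail with more than ImportError
        report["import_error"] = repr(exc)
        return report

    report["torch_version"] = torch.__version__
    report["built_with_cuda"] = torch.version.cuda  # None for CPU-only builds
    report["is_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()

    try:
        # This call forces CUDA initialization, so it surfaces the real error.
        report["current_device"] = torch.cuda.current_device()
    except Exception as exc:
        report["current_device_error"] = repr(exc)
    return report


if __name__ == "__main__":
    for key, value in diagnose_cuda().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` is `None` or `current_device_error` shows up, the environment itself is the likely culprit (for example a CPU-only build of torch), and recreating the conda environment is probably a better fix than changing `CUDA_VISIBLE_DEVICES`.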