Hi!
I am struggling with a virtualization issue. We set up an H100 with 5 partitions. No other GPU is installed, so there is only 1 physical device. We are not able to expose more than one vGPU through CUDA_VISIBLE_DEVICES: only the first one is used, no matter what. I have read a lot about PyTorch only being able to see one physical GPU if one is in MIG mode and the others aren't. But does this also hold true for a single GPU with virtualization?
Examples:
CUDA_VISIBLE_DEVICES="MIG-f7267b59-7dd4-5be3-8d25-16ceda48e4ba,MIG-c0d67c52-1804-5a4f-bae7-752caa2b8f21" python test.py
Is cuda available? True
Device count? 1
CUDA_VISIBLE_DEVICES="0,1" python test.py
Is cuda available? True
Device count? 1
where test.py is
import torch
print("Is cuda available?", torch.cuda.is_available())
print("Device count?", torch.cuda.device_count())
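In case it helps with diagnosis, here is a slightly more verbose version of the check. It is only a minimal sketch: it prints how many entries were requested in CUDA_VISIBLE_DEVICES next to what PyTorch actually enumerates, so the mismatch is visible in one run. The helper name visible_device_entries is my own and not part of PyTorch, and the import is guarded only so the script still runs on a machine without PyTorch installed.

```python
import os


def visible_device_entries(value):
    """Split a CUDA_VISIBLE_DEVICES string into its individual entries.

    Hypothetical helper for this post, not a PyTorch or CUDA API.
    """
    if value is None:
        return []
    return [entry.strip() for entry in value.split(",") if entry.strip()]


def main():
    raw = os.environ.get("CUDA_VISIBLE_DEVICES")
    entries = visible_device_entries(raw)
    print("CUDA_VISIBLE_DEVICES:", raw)
    print("Entries requested:", len(entries))

    try:
        import torch
    except ImportError:
        print("PyTorch is not installed; skipping the CUDA checks.")
        return

    print("Is cuda available?", torch.cuda.is_available())
    count = torch.cuda.device_count()
    print("Device count?", count)

    # List every device PyTorch enumerates, to show which partition was picked.
    for index in range(count):
        print("Device", index, "->", torch.cuda.get_device_name(index))


if __name__ == "__main__":
    main()
```

It is run the same way as test.py, with CUDA_VISIBLE_DEVICES set on the command line.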
Thank you very much!