I have been given access to a GPU cluster where the GPUs (2x NVIDIA A100 80GB) are partitioned into sub-elements using MIG…
Unfortunately, I cannot find an example that shows how to access a partition via the UUID of the sub-element (MIG-11c29e81-e611-50b5-b5ef-609c0a0fe58b)… Or rather, how do I tell torch to use it?
device("cuda:0") would not be enough, since it only describes the GPU the partition is placed on…
That variable (CUDA_VISIBLE_DEVICES) does not exist in my environment… StackOverflow suggests exporting it, e.g. "export CUDA_VISIBLE_DEVICES=0", but what would CUDA_VISIBLE_DEVICES then give me? I already know all the IDs.
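
For reference, here is a minimal sketch of what the environment-variable approach might look like when the MIG UUID is used instead of a plain index. It assumes a driver recent enough to accept the `MIG-<uuid>` form in CUDA_VISIBLE_DEVICES (older drivers may expect a `MIG-GPU-<gpu-uuid>/<gi>/<ci>` style identifier instead), and `select_mig_device` is a hypothetical helper name, not part of torch. The variable is not something to read back. It is set by the user to restrict which devices the process can see, so the selected MIG instance should then show up as "cuda:0" inside that process.

```python
import os

# UUID copied from the question; `nvidia-smi -L` lists the ones on your node.
MIG_UUID = "MIG-11c29e81-e611-50b5-b5ef-609c0a0fe58b"


def select_mig_device(uuid: str) -> None:
    """Hypothetical helper: restrict this process to one MIG instance.

    Must run before CUDA is initialised; the safest place is before
    torch is imported at all.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = uuid


select_mig_device(MIG_UUID)

# Guarded import so the sketch also runs on a machine without torch.
try:
    import torch
except ImportError:
    torch = None

if torch is not None and torch.cuda.is_available():
    # The only visible device is the chosen MIG slice, so it is index 0.
    device = torch.device("cuda:0")
    print(torch.cuda.get_device_name(device))
```

The same effect should be achievable from the shell with `export CUDA_VISIBLE_DEVICES=MIG-11c29e81-e611-50b5-b5ef-609c0a0fe58b` before starting Python. Note that, as far as I understand, a single CUDA process can only use one MIG instance at a time.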