I found that CUDA_VISIBLE_DEVICES is no longer respected after calling torch.cuda.is_available() or torch.cuda.device_count(). It seems these two functions freeze the value of CUDA_VISIBLE_DEVICES after the first call. Is this intended behavior or a bug? This behavior caused some trouble for me in torch_xla with multiprocessing, as discussed in "Calling torch.cuda.is_available() with multiprocessing exhausts memory" (pytorch/xla issue #3347 on GitHub).
As explained in the linked issue, CUDA_VISIBLE_DEVICES has to be set before the first CUDA call.