The following code always allocates memory on cuda:0, but I want to use a different GPU for training:
import torch
a = torch.randn([20000, 20000])
a.pin_memory() # <- this creates a CUDA context (and allocates memory) on cuda:0, but I want to leave that GPU unused
b = a.cuda('cuda:1')
This most likely happens because getting pinned memory requires a CUDA context, so PyTorch initializes one on the current device the first time it is needed. By default, the current device here is cuda:0.
Changing the current device before calling pin_memory() will help.
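A minimal sketch of that fix, assuming two visible GPUs. torch.cuda.set_device changes the current device; note also that pin_memory() returns a new pinned tensor rather than pinning in place, so its result should be kept:

import torch

# Make cuda:1 the current device, so the CUDA context created as a
# side effect of pinning lands on cuda:1 instead of cuda:0.
torch.cuda.set_device(1)

a = torch.randn([20000, 20000])
a = a.pin_memory()  # returns a new pinned tensor; the original is unchanged
b = a.cuda('cuda:1', non_blocking=True)  # pinned memory allows async host-to-device copies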
Also, if you never want to touch cuda:0, a good practice is to use the CUDA_VISIBLE_DEVICES=1 environment variable. This acts at the NVIDIA driver level for the current process and hides the other GPUs, so you can be sure you never use them.
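For example (a sketch; the variable must be set before CUDA is initialized in the process, so setting it from the shell, e.g. CUDA_VISIBLE_DEVICES=1 python train.py with train.py standing in for your own script, is the safest option):

import os

# Hide every GPU except physical GPU 1; inside this process it will
# appear as cuda:0. This must run before the first CUDA call.
os.environ['CUDA_VISIBLE_DEVICES'] = '1'

import torch

a = torch.randn([20000, 20000]).pin_memory()
b = a.cuda()  # the default device cuda:0 now maps to physical GPU 1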