It seems that when you deepcopy a tensor, the copy is by default created on the first GPU, even if the source tensor was allocated on a different GPU.
from copy import deepcopy
import torch
x = torch.ones((1,), device=torch.device("cuda", 1))
print(x)
## result : tensor([ 1.], device='cuda:1')
y = deepcopy(x)
print(y)
## result : tensor([ 1.], device='cuda:0')
This code copies x onto the first GPU (cuda:0) instead of keeping it on cuda:1.
Any idea what might be causing this, and is there a way to work around it (if it is not the intended behavior)?
I used a tensor in the example above, but what I actually want is to copy an entire network, without having to copy it onto the first GPU and then move it back to the right GPU by hand.
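Until the underlying behavior is fixed, one simple workaround (a sketch, not an official API; the helper name `deepcopy_to` is mine) is to deepcopy the module and then explicitly move the copy to the device you intended, rather than trusting the placement of the fresh copy:

```python
from copy import deepcopy
import torch

def deepcopy_to(module, device):
    """Deep-copy a module, then explicitly move the copy to `device`.

    Works around deepcopy possibly landing parameters on the wrong GPU
    by always re-asserting the target device after the copy.
    """
    copied = deepcopy(module)
    return copied.to(device)

# CPU demo; on a multi-GPU box you would pass torch.device("cuda", 1).
net = torch.nn.Linear(4, 2)
clone = deepcopy_to(net, torch.device("cpu"))

# The clone owns its own parameters: mutating it leaves the original alone.
with torch.no_grad():
    clone.weight.zero_()
print(clone.weight.eq(0).all().item())  # True: clone zeroed
```

The extra `.to(device)` is a no-op when the copy already lives on the right device, so the helper is safe to use unconditionally.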
I experienced the same issue and hence filed a report here:
Another problem with deepcopy (https://github.com/pytorch/pytorch/issues/315) was fixed only recently, so the fix is not yet available in the most recent stable version of PyTorch, i.e. 0.4.1.