Multiple GPU error: not on the same device


I am running the code below in multi-GPU mode. Whether I move the x, y variables to cuda() or leave them on the CPU, I cannot get the whole network running on the same device. Depending on the device, I get one of these errors:

RuntimeError: expected device cuda:1 but got device cuda:0
RuntimeError: expected device cuda:0 but got device cpu
import numpy as np
import torch
import torch.nn as nn

# x = torch.linspace(0, args.max - 1, args.max).cuda()
# y = torch.linspace(0, args.max - 1, 4 * args.max).cuda()

x = torch.from_numpy(np.linspace(0, args.max - 1, args.max))
y = torch.from_numpy(np.linspace(0, args.max - 1, 4 * args.max))

model = network(args.max, x, y)
model = nn.DataParallel(model)

I have tried passing x and y into the network in both CUDA and CPU mode, but the same error occurs.

I would appreciate your help.

Are you using any cuda() or to() calls inside your model (in __init__ or forward)?
If so, could you remove them, as they might be creating the device mismatch?
nn.DataParallel automatically creates model replicas for you, so you don't need to push internal model tensors to a specific device manually (only the nn.DataParallel-wrapped model to the default device).
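To illustrate the point, here is a minimal sketch with a hypothetical Net module (the layer sizes and tensors are made up): register constant tensors as buffers in __init__ instead of calling .cuda() on them, so nn.DataParallel can copy them onto each replica's device automatically.

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, x, y):
        super().__init__()
        # Buffers move together with the module instead of being
        # pinned to one GPU by an explicit .cuda() call.
        self.register_buffer("x", x)
        self.register_buffer("y", y)
        self.fc = nn.Linear(4, 2)

    def forward(self, inp):
        # self.x lives on the same device as this replica's parameters.
        return self.fc(inp) + self.x.sum()

model = Net(torch.arange(4.0), torch.arange(8.0))
if torch.cuda.is_available():
    # Push only the DataParallel wrapper to the default device.
    model = nn.DataParallel(model).cuda()

out = model(torch.randn(3, 4))
```

With this pattern the replicas created by nn.DataParallel each carry their own copy of x and y on the correct GPU.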


Yes, I was using them in __init__. I have now moved them into the model's forward, and the problem seems to be solved.
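For anyone hitting the same issue, a sketch of that fix (again with a hypothetical Net module): instead of sending x and y to a fixed device in __init__, move them to the incoming batch's device inside forward, so each replica uses a copy on its own GPU.

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, x, y):
        super().__init__()
        # Plain attributes stay on the CPU until forward runs.
        self.x = x
        self.y = y
        self.fc = nn.Linear(4, 2)

    def forward(self, inp):
        # Follow whichever device this replica's input is on.
        x = self.x.to(inp.device)
        return self.fc(inp) + x.sum()

model = Net(torch.arange(4.0), torch.arange(8.0))
out = model(torch.randn(3, 4))
```

Registering the tensors as buffers is the cleaner option, since the .to() copy then happens once at replication time rather than on every forward pass.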