Hi, I tried the following code with multiple GPUs, but the call “loss.backward()” throws the error “RuntimeError: arguments are located on different GPUs”. The PyTorch version is 1.0.0. How can I fix this?
from torch import nn
import torch

class MyModule1(nn.Module):
    def __init__(self):
        super(MyModule1, self).__init__()
        self.fc1 = nn.Linear(2, 1)

    def forward(self, x):
        x = self.fc1(x)
        return x

class MyModule2(nn.Module):
    def __init__(self):
        super(MyModule2, self).__init__()
        self.fc2 = nn.Linear(1, 1)

    def forward(self, x):
        x = self.fc2(x)
        return x

model1 = MyModule1()
model1.to('cuda:0')
x = torch.tensor([10, 5], dtype=torch.float).to('cuda:0')
y = model1(x)
z = y.to('cuda:1')
model2 = MyModule2()
model2.to('cuda:1')
w = model2(y)
w.backward()
@DoubtWang I think the problem is that you cannot backpropagate through two different devices. That is, with input->device1->device2->output, output.backward() would stop at device2.
Autograd is able to create the backward pass through different devices.
The error in the first code was that y (which was still on cuda:0) was passed to model2 (which is on cuda:1), while z should have been passed instead.
I’m wondering why the forward pass didn’t throw an error.
Anyway, after fixing this bug, the code should work.
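For reference, here is a minimal sketch of the corrected flow. It is not the original poster's exact code: plain nn.Linear layers stand in for MyModule1 and MyModule2, and the dev0/dev1 names and the CPU fallback (used when fewer than two GPUs are visible) are assumptions added so the snippet runs anywhere.

```python
import torch
from torch import nn

# Assumption: fall back to CPU if two GPUs are not available,
# so the sketch still runs on a single-device machine.
two_gpus = torch.cuda.is_available() and torch.cuda.device_count() >= 2
dev0 = torch.device('cuda:0' if two_gpus else 'cpu')
dev1 = torch.device('cuda:1' if two_gpus else 'cpu')

# Stand-ins for MyModule1 and MyModule2 from the question.
model1 = nn.Linear(2, 1).to(dev0)
model2 = nn.Linear(1, 1).to(dev1)

x = torch.tensor([10, 5], dtype=torch.float, device=dev0)
y = model1(x)       # result lives on dev0
z = y.to(dev1)      # differentiable copy on dev1
w = model2(z)       # pass z (on dev1), not y (on dev0)
w.backward()        # w has a single element, so no gradient argument is needed

# Gradients end up on the device of each parameter.
print(model1.weight.grad.device, model2.weight.grad.device)
```

The .to() call is tracked by autograd, so the backward pass moves the gradient from the second device back to the first one on its own.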
No, you can just calculate the loss etc. as usual.
You would just need to make sure the tensors and parameters are on the appropriate device.
In the example code you could just call w.backward() or calculate the loss with a target on GPU1 and call loss.backward().
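A rough sketch of the loss variant, under the same assumptions as the thread's example: nn.Linear layers stand in for the two modules, the target value 1.0 and the choice of MSE loss are made up for illustration, and the code falls back to CPU if two GPUs are not available.

```python
import torch
from torch import nn
import torch.nn.functional as F

# Assumption: fall back to CPU if two GPUs are not available.
two_gpus = torch.cuda.is_available() and torch.cuda.device_count() >= 2
dev0 = torch.device('cuda:0' if two_gpus else 'cpu')
dev1 = torch.device('cuda:1' if two_gpus else 'cpu')

model1 = nn.Linear(2, 1).to(dev0)   # first stage on dev0
model2 = nn.Linear(1, 1).to(dev1)   # second stage on dev1

x = torch.tensor([10, 5], dtype=torch.float, device=dev0)
# Hypothetical target; it must be on the same device as the final output.
target = torch.tensor([1.0], device=dev1)

out = model2(model1(x).to(dev1))    # move the activation before the second stage
loss = F.mse_loss(out, target)      # both arguments are on dev1
loss.backward()                     # gradients flow back through both devices

print(loss.item(), model1.bias.grad, model2.bias.grad)
```

The only device-related requirements are that each module sees inputs on its own device and that the output and the target share a device when the loss is computed.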
Thank you very much for your example.
I get your point: I need to make sure the tensors and parameters are on the appropriate device during the model's computation, and that the final output and the target are on the same device when calling backward.
Thanks again.