Hello everybody,
I am just having some issues executing a custom module on multiple GPUs.
Here is an equivalent sample of the code I am trying to debug:
class fooModule(nn.Module):
    def __init__(self):
        super(fooModule, self).__init__()
        self.first = True

    def forward(self, input):
        if self.first:
            print('First time in fooModule')
            self.first = False
        else:
            print('NOT first time in fooModule')
When I execute the following:
net = fooModule()
x = Variable(torch.Tensor(1))
for i in range(200):
    net(x)
the output is:
First time in fooModule
NOT first time in fooModule
NOT first time in fooModule
NOT first time in fooModule
NOT first time in fooModule
NOT first time in fooModule
NOT first time in fooModule
NOT first time in fooModule
…
If instead I try with:
net = fooModule()
net = torch.nn.DataParallel(net).cuda()
x = Variable(torch.Tensor(1)).cuda()
the output is:
First time in fooModule
First time in fooModule
First time in fooModule
First time in fooModule
First time in fooModule
First time in fooModule
First time in fooModule
First time in fooModule
First time in fooModule
First time in fooModule
First time in fooModule
…
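For reproducibility, here is the full self-contained script I am running, put together from the snippets above. Two small changes of mine: I added a `return input` so DataParallel has something to gather across devices (the original forward returns nothing), and a guard so the script also runs on a CPU-only machine (in that case DataParallel just calls the underlying module, so the duplicated "First time" prints only show up with multiple GPUs):

```python
import torch
import torch.nn as nn
from torch.autograd import Variable

class fooModule(nn.Module):
    def __init__(self):
        super(fooModule, self).__init__()
        self.first = True

    def forward(self, input):
        if self.first:
            print('First time in fooModule')
            self.first = False
        else:
            print('NOT first time in fooModule')
        # Return the input so DataParallel can gather outputs (my addition).
        return input

net = torch.nn.DataParallel(fooModule())
x = Variable(torch.Tensor(1))
if torch.cuda.is_available():
    # Only move to GPU when CUDA is present (my addition for portability).
    net = net.cuda()
    x = x.cuda()

for i in range(200):
    net(x)
```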
I am not an expert with PyTorch, so I was wondering whether there is something wrong in the implementation that I am not seeing.
What I expected was exactly the same output as the model running on the CPU.
Thank you in advance; this is the best deep learning framework I have ever used, especially for the documentation and your availability in helping PyTorch users.
Jak94