I am trying to use multiple GPUs for a deep neural network. As long as the network only needs forward and backward passes, torch.nn.DataParallel works fine. However, I would like to add one more method to the network, like this:
class MyNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(10, 20)

    def forward(self, x):
        x = self.layer(x)
        return x

    def evaluate(self, x):
        return some_scalar_value(x, self)
The function evaluate returns a scalar value. How can I call it through DataParallel and average the values computed on the different GPUs? If I only needed the forward and backward passes, I could simply use
model = MyNN()
model = torch.nn.DataParallel(model, device_ids)
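One workaround I have been considering is to dispatch evaluate through forward, since DataParallel only scatters calls to forward. Below is a minimal sketch of that idea; the mode keyword and the body of evaluate (here just a mean, standing in for some_scalar_value) are my own placeholders, not part of my actual model:

```python
import torch
import torch.nn as nn

class MyNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(10, 20)

    def forward(self, x, mode="forward"):
        # DataParallel replicates the module and scatters forward()
        # across GPUs, so route every code path through forward.
        if mode == "evaluate":
            return self.evaluate(x)
        return self.layer(x)

    def evaluate(self, x):
        # Placeholder scalar; return shape (1,) so DataParallel's
        # gather step concatenates one value per GPU shard.
        return self.layer(x).mean().unsqueeze(0)

model = nn.DataParallel(MyNN())  # falls back to a plain call on CPU
x = torch.randn(8, 10)
scores = model(x, mode="evaluate")  # one scalar per GPU shard
avg = scores.mean()                 # average over the GPUs
```

With this approach, scores holds one entry per replica and scores.mean() gives the average, though I am not sure whether there is a cleaner way than overloading forward with a mode argument.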