I'm using DataParallel with 2 GPUs, and I register a hook on an intermediate tensor in my model's forward, like this:
def forward(self, x):
    x = self.features(x)
    x.register_hook(self.save_gradient)
    inter = x
    x = x.view(x.size(0), -1)
    x = self.classifier(x)
When I call backward, the gradient at that tensor arrives split into two chunks (one per GPU).
I tried collecting the chunks in a list, but how can I reassemble them in the order of the original input batch? Does DataParallel compute the gradients sequentially in GPU-id order?
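One way to answer the ordering question in code: DataParallel scatters the input batch along dim 0 in `device_ids` order, so if each gradient chunk is keyed by the device it came from, concatenating the chunks sorted by device index restores the input order (assuming the default ascending `device_ids`). Below is a minimal sketch under that assumption; the `Net` architecture, the `grads` dict, and `save_gradient` are hypothetical stand-ins for your model and hook. The dict is kept outside the module so every replica's hook writes into the same container:

```python
import torch
import torch.nn as nn

# device index -> gradient chunk, filled during backward.
# Defined outside the module so all DataParallel replicas share it.
grads = {}

def save_gradient(grad):
    # On CPU grad.device.index is None, so map it to 0.
    idx = grad.device.index if grad.device.index is not None else 0
    grads[idx] = grad.detach().cpu()

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.classifier = nn.Linear(8 * 4 * 4, 10)

    def forward(self, x):
        x = self.features(x)
        if x.requires_grad:
            x.register_hook(save_gradient)  # fires once per replica
        x = x.view(x.size(0), -1)
        return self.classifier(x)

model = Net()
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model).cuda()

inp = torch.randn(6, 3, 4, 4)
if next(model.parameters()).is_cuda:
    inp = inp.cuda()

out = model(inp)
out.sum().backward()

# Chunks were scattered along dim 0 in device_ids order, so sorting
# by device index and concatenating restores the input batch order.
full_grad = torch.cat([grads[i] for i in sorted(grads)], dim=0)
print(full_grad.shape)
```

On a single device there is only one chunk, but with 2 GPUs `grads` gets keys 0 and 1 and `full_grad` has the full batch size. Note the hooks fire during backward on each GPU's replica, so the order they *fire* in is not guaranteed; keying by device rather than relying on list-append order is what makes the reassembly deterministic.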