How can I combine the per-GPU gradients from DataParallel (multiple GPUs) in the order of the input batch?

I am using DataParallel with 2 GPUs. I call register_hook inside the model's forward, like this:

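def forward(self, x):
    x = self.features(x)
    # During backward, self.save_gradient is called with the gradient w.r.t. x.
    x.register_hook(self.save_gradient)
    inter = x

    # Flatten everything except the batch dimension for the classifier.
    x = x.view(x.size(0), -1)
    x = self.classifier(x)
    return x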

When I call backward, the gradient is split into two parts, one per GPU.
I tried collecting the parts in a list, but how can I put them back together in the order of the input batch? Does DataParallel compute the gradients sequentially, in the order of the GPU ids (device_ids)?
Thanks!

I have the same problem. Did you find a solution? Thanks.
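
One possible approach, offered as a sketch rather than a confirmed solution: instead of appending to a list (whose order depends on which replica's hook happens to fire first), key each gradient chunk by the GPU it lives on and read the chunks back in device_ids order. This assumes that DataParallel splits the batch along dim 0 into consecutive chunks and hands them to the GPUs in device_ids order, and that the hooks may fire in any order. GradCollector and its methods are hypothetical names, not part of PyTorch; the only tensor attributes it relies on are grad.device.index and grad.detach().

```python
import threading


class GradCollector:
    """Hypothetical helper (not a PyTorch API): stores one gradient chunk per GPU."""

    def __init__(self):
        # Hooks for different GPUs may run on different backward threads,
        # so the lock is a precaution around the shared dict.
        self._lock = threading.Lock()
        self._chunks = {}

    def save(self, grad):
        # Intended for use as a tensor hook: x.register_hook(collector.save).
        # grad.device.index is the id of the GPU that holds this chunk.
        with self._lock:
            self._chunks[grad.device.index] = grad.detach()
        # Returning None leaves the gradient itself unchanged.

    def ordered(self, device_ids):
        # Chunks in device_ids order, which (under the assumption above)
        # matches the order of the samples in the input batch.
        # GPUs that received no chunk are skipped.
        with self._lock:
            return [self._chunks[d] for d in device_ids if d in self._chunks]

    def clear(self):
        # Call before each backward pass so stale chunks are not reused.
        with self._lock:
            self._chunks.clear()
```

With a collector created outside the model, forward would call x.register_hook(collector.save) in place of self.save_gradient. After loss.backward(), something like torch.cat([g.to("cuda:0") for g in collector.ordered([0, 1])], dim=0) should give a single gradient tensor whose rows line up with the input batch.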