Using forward hooks with nn.DataParallel on multiple GPUs

Hi, I’m trying to extract intermediate features from a general ResNet.
I’m able to extract the intermediate features with the following code snippet on a single GPU.
The snippet is part of my own network module class, which uses a ResNet as self.base_model.

self.hooks = [None]

def set_hooks(self):
    # placeholder that the forward hook fills in during the forward pass
    self.activations = [None]

    def forward_hook(module, input, output):
        # store the output of layer4; the hook returns None, so the
        # layer's output is passed on unchanged
        self.activations[0] = output

    self.hooks[0] = self.base_model.layer4.register_forward_hook(forward_hook)

After a single forward step, I read model.activations[0] to get the intermediate features.
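For reference, this is roughly how I use it on a single GPU (MyNetwork and the input shape are just placeholders for my own class and data):

import torch

model = MyNetwork().cuda()                 # the module class that wraps self.base_model
model.set_hooks()                          # register the forward hook on layer4

x = torch.randn(8, 3, 224, 224).cuda()     # dummy input batch
_ = model(x)                               # the forward pass fires the hook
features = model.activations[0]            # intermediate features from layer4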

The problem is that when I modify my code to use nn.DataParallel over multiple GPUs, model.module.activations[0] only holds the result from a single GPU (I’m not even sure whether it comes from the first GPU or one of the others).

How can I get the results of the forward hook from all of the GPUs?
One option would be to append the result of each forward hook to a list and then read the results back by indexing into that list, but I’m not sure whether the order (or rank) of the GPUs is guaranteed to stay the same at every iteration.

I’d greatly appreciate any help you can offer.
Thank you.


For those who still need help, I solved the problem by applying the solution from https://discuss.pytorch.org/t/aggregating-the-results-of-forward-backward-hook-on-nn-dataparallel-multi-gpu/28981/10.
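In short, the idea there is to let the hook store one output per device instead of a single shared slot, since the replicas created by nn.DataParallel end up calling the same hook closure and would otherwise overwrite each other. A rough sketch of that approach (the sorting and gather details below are my own reconstruction, not copied verbatim from the linked post):

def set_hooks(self):
    # one entry per GPU: keying by output.device keeps the replicas'
    # results from overwriting each other
    self.activations = {}

    def forward_hook(module, input, output):
        self.activations[output.device] = output

    self.hooks[0] = self.base_model.layer4.register_forward_hook(forward_hook)

After the forward pass through nn.DataParallel(model), the per-GPU outputs can be put back into a deterministic order and concatenated, e.g.:

acts = model.module.activations
ordered = [acts[d] for d in sorted(acts, key=lambda d: d.index)]   # sort by GPU index
features = torch.nn.parallel.gather(ordered, target_device=0)      # concatenate onto GPU 0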
