Hi, I want to use a hook function on a DataParallel instance, but I have a few uncertain points.
I think the code below would work, but I'm not sure about the following:
Do I need to take a lock when I access self.target_outputs and self.target_outputs_grad in the hook functions?
Is it guaranteed that the inputs are scattered to the GPUs in their original order? For example, if the inputs were [1, 2, 3, 4] and there are 2 GPUs, are [1, 2] fed to GPU #1 and [3, 4] to GPU #2?
It seems the inputs are scattered to the GPUs in their original order, so aggregating the outputs of each GPU in GPU order matches the original inputs. Indeed, I can confirm this in the gather step of nn.DataParallel.
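Below is a minimal sketch to sanity-check that ordering, assuming GPUs 0 and 1 are available (the device IDs are just placeholders): it calls the scatter helper that nn.DataParallel uses internally and prints which chunk lands on which device.

import torch
from torch.nn.parallel import scatter

# batch of four "samples" so the split is easy to see
inputs = torch.tensor([1., 2., 3., 4.]).view(4, 1).cuda(0)
chunks = scatter(inputs, [0, 1])  # same split DataParallel performs before parallel_apply
for device_id, chunk in zip([0, 1], chunks):
    # expected: GPU 0 gets [1., 2.], GPU 1 gets [3., 4.]
    print(device_id, chunk.view(-1).tolist())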
So would you please show us what your code finally looks like?
I also tried to get intermediate outputs via a forward hook using multiple GPUs, but a weird thing happened.
In __init__, I initialize self.target_outputs as None.
In the hook function, I print self.target_outputs, and it is not None when the hook is executed.
But after self.model.forward() has executed, self.target_outputs turns back to None.
One of the two forwards outputs None, which is very weird.
I implement this using the data_parallel function and treat the Wrapper class as an nn.Module whose forward returns self.target_outputs, so data_parallel will scatter the inputs and gather all the outputs of Wrapper.
The code looks like this:
import torch
from torchvision.models.vgg import vgg19

class Wrapper(torch.nn.Module):
    def __init__(self, model):
        super(Wrapper, self).__init__()
        self.model = model
        self.target_outputs = None

        # the hook stores the intermediate output of features[2] on the wrapper
        def forward_hook(_, __, output):
            self.target_outputs = output.detach()

        self.model.features[2].register_forward_hook(forward_hook)

    def forward(self, input):
        self.model(input)
        return self.target_outputs

model = vgg19()
model = model.cuda(4)
wrapper = Wrapper(model)
devices = [4, 5]

inputs = torch.randn(60, 3, 224, 224).fill_(0).cuda(4)
out = torch.nn.parallel.data_parallel(wrapper, inputs, devices)
print("first forward: ", out)

inputs = torch.randn(60, 3, 224, 224).fill_(1).cuda(4)
out = torch.nn.parallel.data_parallel(wrapper, inputs, devices)
print("second forward: ", out.shape)
The output is:
first forward: None
second forward: (60, 64, 224, 224)
By comparing the result with the single-GPU result, I found that the result of the second forward is actually the result of the first forward.
There is a solution for getting intermediate outputs via forward hooks on multiple GPUs in this post, although it is not so elegant. It does work in my test.
But the weird phenomenon explained in the post above still confuses me.
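For reference, here is a minimal, untested sketch of one common workaround (not necessarily the one from the linked post): instead of overwriting a single attribute, the hook stores each replica's activation in a dict keyed by the device it was computed on, and forward reads back the entry for its own device.

import torch
from torchvision.models.vgg import vgg19

class Wrapper(torch.nn.Module):
    def __init__(self, model):
        super(Wrapper, self).__init__()
        self.model = model
        self.target_outputs = {}  # torch.device -> activation of that replica

        def forward_hook(_, __, output):
            # each replica writes its own key, so the replicas do not clobber each other
            self.target_outputs[output.device] = output.detach()

        self.model.features[2].register_forward_hook(forward_hook)

    def forward(self, input):
        self.model(input)
        # read back the activation produced on the device this replica ran on
        return self.target_outputs[input.device]

The idea is that replication only makes a shallow copy of the module's attributes, so the replicas end up sharing the same dict object; keying by device keeps them from racing on a single value.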
Hi,
Thanks for the amazing solution. It works correctly for the forward pass.
I am using the forward pass along with the backward pass and doing certain operations. Each time backward is called, it runs on a different thread, and the output of the backward hook gets saved under that thread's ID. My problem is how to find the backward thread ID that corresponds to a given forward thread ID (see the sketch after the code below).
self.forward_hook = self.target_layer.register_forward_hook(self.hook_fn_act)
self.backward_hook = self.target_layer.register_backward_hook(self.hook_fn_grad)

def hook_fn_act(self, module, input, output):
    # saved under the ID of the thread running the forward pass
    self.activations[threading.get_native_id()] = output.detach()

def hook_fn_grad(self, module, grad_input, grad_output):
    # saved under the ID of the autograd thread running the backward pass
    self.gradients[threading.get_native_id()] = grad_output[0].detach()

def foo():
    <Certain operations using self.activations and self.gradients>
    <Problem: the thread IDs for the activations and gradients are not the same,
     so how do I know which thread ID of activations maps to which one of gradients?>
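A minimal, untested sketch of one possible way around this (hypothetical class and names, assuming the target layer is run under DataParallel): key both dicts by the tensor's device instead of the thread ID. The activation and the gradient belonging to one replica always live on the same GPU, so the device gives a common key across the forward and backward passes.

import torch

class HookedLayer:
    def __init__(self, target_layer):
        self.activations = {}  # torch.device -> activation
        self.gradients = {}    # torch.device -> gradient
        self.forward_hook = target_layer.register_forward_hook(self.hook_fn_act)
        self.backward_hook = target_layer.register_backward_hook(self.hook_fn_grad)

    def hook_fn_act(self, module, input, output):
        self.activations[output.device] = output.detach()

    def hook_fn_grad(self, module, grad_input, grad_output):
        self.gradients[grad_output[0].device] = grad_output[0].detach()

    def foo(self):
        # pair each device's activation with the gradient computed on the same device
        for device, act in self.activations.items():
            grad = self.gradients.get(device)
            if grad is not None:
                print(device, act.shape, grad.shape)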