Seeing inconsistent behavior with register_forward_hook on pretrained networks

I have been looking at adding layer-output visualizations to my models and have been working through doing this on pretrained networks, modeled after some approaches I found online.

That said, I found that some of the pretrained Conv2d layers output an activation of 0 when I run a random image through, and while experimenting with why, I found that I would get inconsistent outputs depending on which layers I ran the input through.

import numpy as np
import torch.nn as nn
from torchvision import models

class HookedLayerRunner():
    def __init__(self, model, selected_layer):
        self.model = model
        self.selected_layer = selected_layer
        self.layer_output = None

    def hook_layer(self):
        # forward hooks are called as hook(module, input, output)
        def hook_function(module, inp, out):
            # keep the first item in the batch for the hooked layer
            self.layer_output = out[0]
        self.model[self.selected_layer].register_forward_hook(hook_function)

    def process(self, img):
        self.hook_layer()
        # process_image and device come from elsewhere in my script
        processed_image = process_image(img, device)
        self.model(processed_image)

And here is the code exercising the apparent bug:

filt = 10
model = models.vgg16(pretrained=True)
img = np.uint8(np.random.uniform(150, 180, (100, 100, 3))) / 255.0 - 0.5

# 1) run the full pretrained feature stack with a hook on layer 0
act = HookedLayerRunner(model.features, 0)
act.process(img)
print(act.layer_output[filt].mean())

# 2) run layer 0 on its own; the hook registered above is on this same
#    module object, so it fires again and updates act.layer_output
t = process_image(img, device)
conv = model.features[0]
res = conv(t)
print(act.layer_output[filt].mean())

# 3) run layer 0 followed by layer 1 (a ReLU) in a fresh nn.Sequential
conv = nn.Sequential(model.features[0], model.features[1])
res = conv(t)
print(act.layer_output[filt].mean())

And the results, showing an inconsistency depending on whether the first conv layer runs inside an nn.Sequential (as it does in the pretrained vgg16) or the Conv2d layer runs as a singleton:

tensor(0.7511, grad_fn=<MeanBackward0>)
tensor(0.7456, grad_fn=<MeanBackward0>)
tensor(0.7511, grad_fn=<MeanBackward0>)

Any thoughts on what is going on here? Am I just using the hook incorrectly, or is there some behavior of the hooked forward output that depends on the subsequent layers (a ReLU in this case)?
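
For reference, here is how I understand register_forward_hook is supposed to behave, sanity-checked on a toy Conv2d outside of VGG (a sketch only; the toy layer, the captured dict, and the printed shapes are made up for illustration rather than taken from my actual script):

import torch
import torch.nn as nn

# toy layer purely to check the hook signature: a forward hook receives
# (module, input, output), where input is a tuple of the inputs and
# output is the tensor the module just returned
conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
captured = {}

def hook_function(module, inp, out):
    captured["output"] = out       # reference to the module's output tensor
    captured["shape"] = out.shape

handle = conv.register_forward_hook(hook_function)

x = torch.randn(1, 3, 100, 100)
y = conv(x)

print(captured["shape"])                  # torch.Size([1, 8, 100, 100])
print(torch.equal(captured["output"], y)) # True immediately after the forward pass

handle.remove()  # hooks stay registered until the handle is removed

If that matches the intended semantics, I would expect the value I grab in the hook to be fixed at the moment the hooked module returns.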

I tried different layer combinations and found some interesting differences. The issue seems to arise only when the layer following the conv in the sequential is a ReLU. Output from the different models I ran:

Whole Model                          0.7501
Sequential(Conv2D)                   0.7446
Sequential(Conv2D, ReLU)             0.7501
Manual sequential1 (Conv2D, ReLU)    0.7446
Manual sequential2 (Conv2D, ReLU)    0.7501
Sequential(Conv2D, Conv2D, ReLU)     0.7446

I can’t figure out how putting a ReLU after the convolution would change the output of the convolution as captured by the hook. Even manually running the sequence of layers seems to exercise the issue…
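
The next thing I plan to check is whether the tensor I capture in the hook is itself being modified after the hook fires. Here is a sketch of that check (the live/snapshot names and the comparison logic are just for illustration, not part of my original script):

import torch
from torchvision import models

model = models.vgg16(pretrained=True).eval()
captured = {}

def hook_function(module, inp, out):
    captured["live"] = out[0]                       # the tensor the model keeps using
    captured["snapshot"] = out[0].detach().clone()  # frozen copy taken at hook time

handle = model.features[0].register_forward_hook(hook_function)

x = torch.randn(1, 3, 100, 100)
with torch.no_grad():
    model.features(x)

# if this is nonzero, the hooked output was modified after the hook ran
diff = (captured["live"] - captured["snapshot"]).abs().max()
print(diff)

handle.remove()

If the two copies disagree only when a ReLU follows the conv, that would at least narrow down where the change is coming from.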