Issue with Registering Forward Hooks in a Transformer


I am currently trying to track layer activations as part of a project I’m working on.

Previously I had no issues using the register_forward_hook method on convolutional layers in AlexNet. I am now trying to reuse this code for a transformer layer in vit_b_32, but the activations dict is left empty. Below is an excerpt of the code:

def register_hooks(model_layers, features, feature_layer):
    def get_features(name):
        def hook(model, input, output):
            features[name] = output.detach()
        return hook
    model_layers[feature_layer].register_forward_hook(get_features(feature_layer))

model = models.vit_b_32()
feature_layer = 'encoder_layers_encoder_layer_11_self_attention_out_proj'
layers = get_model_layers(model)  # stores layers in a dict with key = layer name and val = layer

activations = {}
register_hooks(layers, activations, feature_layer)

out = model(img_tensor)

print(f'activations dict size: {len(activations)}')

This outputs a dict size of 0 for ViT and 1 for AlexNet, indicating that the hook is never called for the ViT layer.

Any advice offered would be greatly appreciated, and if you need additional information please don’t hesitate to ask!

Your code is not executable as e.g. get_model_layers is undefined, and it's also not properly formatted.
However, it seems you want to access the NonDynamicallyQuantizableLinear layer (out_proj), which is only used for error handling in the MultiheadAttention layer: nn.MultiheadAttention calls the functional attention implementation internally, so out_proj's forward (and any hook on it) is never invoked. Register the hook on the self_attention module itself instead.
This code works for me:

import torch
from torchvision import models

activation = {}
def get_activation(name):
    def hook(model, input, output):
        # nn.MultiheadAttention returns (attn_output, attn_weights)
        activation[name] = output[0].detach()
    return hook

model = models.vit_b_32()
model.encoder.layers.encoder_layer_11.self_attention.register_forward_hook(
    get_activation('self_attention'))

x = torch.randn(1, 3, 224, 224)
out = model(x)
print(out.shape)
# torch.Size([1, 1000])
print(activation['self_attention'].shape)
# torch.Size([1, 50, 768])
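For completeness, the same hook pattern can be checked on a small toy model without torchvision; iterating model.named_modules() prints the exact dotted names that register_forward_hook expects (the toy model and the names below are illustrative, not from the thread):

```python
import torch
import torch.nn as nn

# Toy model standing in for ViT: each submodule gets a dotted name
# from named_modules(), and a forward hook can be attached by that name.
model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Linear(16, 4),
)

activation = {}
def get_activation(name):
    def hook(module, inp, out):
        activation[name] = out.detach()
    return hook

# List every hookable submodule path ('' is the root module itself).
names = [n for n, _ in model.named_modules() if n]

# Look up a module by name and attach the hook to it.
handle = dict(model.named_modules())["0"].register_forward_hook(get_activation("0"))

out = model(torch.randn(2, 8))
print(names)                  # ['0', '1', '2']
print(activation["0"].shape)  # torch.Size([2, 16])
handle.remove()               # detach the hook when done
```

If the activation dict stays empty here, the name passed to the lookup doesn't match any entry in named_modules(), which is the same failure mode as in the ViT code above.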