Using the hook method to extract layer features from a pretrained model under data parallelism

The pretrained model that I use as my encoder has the following layers:

TimeSformer(
  (model): VisionTransformer(
    (dropout): Dropout(p=0.0, inplace=False)
    (patch_embed): PatchEmbed(
      (proj): Conv2d(3, 768, kernel_size=(16, 16), stride=(16, 16))
    )
    (pos_drop): Dropout(p=0.0, inplace=False)
    (time_drop): Dropout(p=0.0, inplace=False)
    (blocks): ModuleList(  #************
      (0): Block(
        (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
        (attn): Attention(
          (qkv): Linear(in_features=768, out_features=2304, bias=True)
          (proj): Linear(in_features=768, out_features=768, bias=True)
          (proj_drop): Dropout(p=0.0, inplace=False)
          (attn_drop): Dropout(p=0.0, inplace=False)
        )
        (temporal_norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
        (temporal_attn): Attention(
          (qkv): Linear(in_features=768, out_features=2304, bias=True)
          (proj): Linear(in_features=768, out_features=768, bias=True)
          (proj_drop): Dropout(p=0.0, inplace=False)
          (attn_drop): Dropout(p=0.0, inplace=False)
        )
        (temporal_fc): Linear(in_features=768, out_features=768, bias=True)
        (drop_path): Identity()
        (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
        (mlp): Mlp(
          (fc1): Linear(in_features=768, out_features=3072, bias=True)
          (act): GELU()
          (fc2): Linear(in_features=3072, out_features=768, bias=True)
          (drop): Dropout(p=0.0, inplace=False)
        )
      )
      (1): Block(
        (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
        (attn): Attention(
          (qkv): Linear(in_features=768, out_features=2304, bias=True)
          (proj): Linear(in_features=768, out_features=768, bias=True)
          (proj_drop): Dropout(p=0.0, inplace=False)
          (attn_drop): Dropout(p=0.0, inplace=False)
        )
        (temporal_norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
        (temporal_attn): Attention(
          (qkv): Linear(in_features=768, out_features=2304, bias=True)
          (proj): Linear(in_features=768, out_features=768, bias=True)
          (proj_drop): Dropout(p=0.0, inplace=False)
          (attn_drop): Dropout(p=0.0, inplace=False)
        )
        (temporal_fc): Linear(in_features=768, out_features=768, bias=True)
        (drop_path): DropPath()
        (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
        (mlp): Mlp(
          (fc1): Linear(in_features=768, out_features=3072, bias=True)
          (act): GELU()
          (fc2): Linear(in_features=3072, out_features=768, bias=True)
          (drop): Dropout(p=0.0, inplace=False)
        )
      )
      ...
    )
  )
)

I need to extract the features of specific layers of the above model, and I am using the hook method as follows:

import torch.nn as nn

class my_class(nn.Module):
    def __init__(self, pretrained=False):
        super(my_class, self).__init__()

        self.featureExtractor = TimeSformer(img_size=224, num_classes=400, num_frames=8,
                                            attention_type='divided_space_time',
                                            pretrained_model='/home/TimeSformer_divST_16x16_448_K400.pyth')

        self.featureExtractor = nn.DataParallel(self.featureExtractor)

        self.list = list(self.featureExtractor.children())

        self.activation = {}
        def get_activation(name):
            def hook(model, input, output):
                self.activation[name] = output.detach()
            return hook

        self.featureExtractor.model.blocks[4].register_forward_hook(get_activation('block4'))
        self.featureExtractor.model.blocks[8].register_forward_hook(get_activation('block8'))
        self.featureExtractor.model.blocks[4].temporal_attn.register_forward_hook(get_activation('block4.temporal_attn'))
        self.featureExtractor.model.blocks[11].register_forward_hook(get_activation('block11'))

    def forward(self, x, out_consp=False):
        b = self.featureExtractor(x)

        block4_output_Temporal_att = self.activation['block4.temporal_attn']
        block4_output = self.activation['block4']
        block8_output = self.activation['block8']
        block11_output = self.activation['block11']
        ...

The problem is that I get the following error:

File "/home/TimeSformer/models/timesformer.py", line 30, in __init__
    self.featureExtractor.model.blocks[4].register_forward_hook(get_activation('block4'))
File "/home/miniconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'DataParallel' object has no attribute 'model'

How can I solve this problem?

DataParallel wraps the model in an internal .module attribute, so you could register the hooks through this additional attribute, or remove nn.DataParallel entirely.
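
A minimal sketch of the first option, reusing the definitions from your code (the TimeSformer constructor arguments, checkpoint path, and get_activation are taken from the question):

import torch.nn as nn

class my_class(nn.Module):
    def __init__(self, pretrained=False):
        super(my_class, self).__init__()

        self.featureExtractor = TimeSformer(img_size=224, num_classes=400, num_frames=8,
                                            attention_type='divided_space_time',
                                            pretrained_model='/home/TimeSformer_divST_16x16_448_K400.pyth')
        self.featureExtractor = nn.DataParallel(self.featureExtractor)

        self.activation = {}
        def get_activation(name):
            def hook(model, input, output):
                self.activation[name] = output.detach()
            return hook

        # DataParallel stores the wrapped network in .module, so the inner
        # VisionTransformer is now reachable as .module.model instead of .model.
        self.featureExtractor.module.model.blocks[4].register_forward_hook(get_activation('block4'))
        self.featureExtractor.module.model.blocks[8].register_forward_hook(get_activation('block8'))
        self.featureExtractor.module.model.blocks[4].temporal_attn.register_forward_hook(get_activation('block4.temporal_attn'))
        self.featureExtractor.module.model.blocks[11].register_forward_hook(get_activation('block11'))

Alternatively, if you drop the nn.DataParallel(...) line, the original .model.blocks[...] accesses work as written.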