Instantiate model layer by name

Hey guys!

My main goal is to reconstruct the forward pass of a pretrained torchvision model in a generic way, so that I can manipulate the inputs/weights/outputs of the conv and fc layers of any model I load, on the fly (I’m not sure hooks are useful here because, as far as I understand, you need to run the whole model to collect these values, and I need to work on the inputs/weights/activations as the forward pass happens).

This is proving tricky for two reasons. Let’s take VGG11 as an example.

When I run the code

for x, y in tqdm(dataloader):
    for name, layer in model.named_modules():
        print(name)
    exit()

The output is

features
features.0
features.1
features.2
features.3
features.4
features.5
features.6
features.7
features.8
features.9
features.10
features.11
features.12
features.13
features.14
features.15
features.16
features.17
features.18
features.19
features.20
avgpool
classifier
classifier.0
classifier.1
classifier.2
classifier.3
classifier.4
classifier.5
classifier.6

Usually, when the name doesn’t have an index, like features, it represents a group of layers; in this case

Sequential(
  (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU(inplace=True)
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (4): ReLU(inplace=True)
  (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (6): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (7): ReLU(inplace=True)
  (8): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (9): ReLU(inplace=True)
  (10): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (11): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (12): ReLU(inplace=True)
  (13): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (14): ReLU(inplace=True)
  (15): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (16): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (17): ReLU(inplace=True)
  (18): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (19): ReLU(inplace=True)
  (20): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)

But when it has an index, like features.0, it’s an actual layer. In this case, features.0 is

Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))

This is not always the case, though: the avgpool layer has no index, yet it is also not a group of layers.

So, the first question is how to instantiate the actual layers by name, so that I can just loop over them and get the input and output of each one. Running something like

correct=0
total=0
for x, y in tqdm(dataloader):
    for name, layer in model.named_modules():
        if is_actual_layer(layer):  # hypothetical helper I don't know how to write
            x = model[name](x)  # pseudocode: run the layer fetched by its name
    y_pred = x
    correct += (y_pred.argmax(axis=1) == y).sum().item()
    total += len(y)
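To make the question concrete, here is the kind of leaf check I have in mind for the hypothetical is_actual_layer: treat a module as a real layer when it has no child modules, and use PyTorch’s model.get_submodule(name) to fetch a layer by its dotted name. The model below is a toy stand-in, not actual torchvision code:

```python
import torch.nn as nn

def is_actual_layer(module: nn.Module) -> bool:
    # A leaf module has no children, so it does real computation
    # (Conv2d, ReLU, Linear, ...) rather than grouping other layers.
    return len(list(module.children())) == 0

# Toy stand-in with the same nested structure as a torchvision model.
model = nn.Sequential(
    nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU()),
    nn.AdaptiveAvgPool2d((1, 1)),
)

for name, layer in model.named_modules():
    if name and is_actual_layer(layer):
        # get_submodule resolves a dotted name like "0.0" back to the layer.
        assert model.get_submodule(name) is layer
        print(name, type(layer).__name__)
```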

That would be equivalent to

correct=0
total=0
for x, y in tqdm(dataloader):
    y_pred = model(x)
    correct += (y_pred.argmax(axis=1) == y).sum().item()
    total += len(y)

The second thing is that, usually, in the forward function we have to flatten the tensor before the classifier layers, since they are composed of linear layers. Is there any way to know when and how to flatten the input? For instance, in the VGG11 forward function we have

def forward(self, x: torch.Tensor) -> torch.Tensor:
    x = self.features(x)
    x = self.avgpool(x)
    x = torch.flatten(x, 1) # input is flattened after the avgpool layer
    x = self.classifier(x)
    return x
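One workaround I can imagine (valid only when the named_modules() order matches the execution order, which holds for purely sequential models like VGG): flatten whenever the incoming tensor is still 4-D but the next leaf layer is nn.Linear. TinyVGG below is a made-up stand-in for the real model:

```python
import torch
import torch.nn as nn

def manual_forward(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    # Only valid when named_modules() order matches execution order,
    # which holds for purely sequential models such as VGG.
    for name, layer in model.named_modules():
        if not name or len(list(layer.children())) > 0:
            continue  # skip the root module and containers
        if isinstance(layer, nn.Linear) and x.dim() > 2:
            x = torch.flatten(x, 1)  # classifier expects (N, features)
        x = layer(x)
    return x

class TinyVGG(nn.Module):
    """Toy stand-in with the same structure as torchvision's VGG."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 4, 3, padding=1), nn.ReLU())
        self.avgpool = nn.AdaptiveAvgPool2d((2, 2))
        self.classifier = nn.Linear(4 * 2 * 2, 10)

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)

model = TinyVGG().eval()
x = torch.randn(2, 3, 8, 8)
assert torch.allclose(manual_forward(model, x), model(x))
```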

Any help is very welcome! Thank you

It’s not trivial since, as you’ve already pointed out, functional API calls such as torch.flatten will be missing; they won’t show up in named_modules().
If you can guarantee your model only contains nn.Modules, you could add a few checks for nn.Sequential etc. in is_actual_layer, but it seems you are more interested in seeing the computation graph. What’s your use case?
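On the on-the-fly concern: forward hooks are not purely observational. Returning a tensor from a forward hook replaces the module’s output before it reaches the next layer, so you can rewrite activations mid-forward. A small sketch, with a toy rounding function standing in for a real quantizer:

```python
import torch
import torch.nn as nn

def quantize_output(module, inputs, output):
    # Returning a tensor from a forward hook replaces the module's
    # output. Here we round to one decimal as a stand-in quantizer.
    return torch.round(output * 10) / 10

model = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2))
handles = [
    m.register_forward_hook(quantize_output)
    for m in model.modules()
    if isinstance(m, (nn.Conv2d, nn.Linear))
]

x = torch.randn(1, 4)
y = model(x)  # every Linear output was quantized during the forward pass

for h in handles:
    h.remove()  # detach the hooks when done
```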


Thanks for the quick answer,

Short answer:
I want to run inference with different models on ImageNet and, for each conv and fc layer, apply a kind of quantization to the weights and outputs.

Long answer:
I’m going to take the weights from each conv and fc layer and write them into matrices such that each weight corresponds to a row of the matrix, and the columns of the matrix are negative powers-of-two multipliers. For example, take the weight matrix

(0.625 0.125)
(0.5   0    )

My matrix will be

 2⁰ 2^(-1) 2^(-2) 2^(-3)
(0    1      0      1) → 0.625
(0    0      0      1) → 0.125
(0    1      0      0) → 0.5
(0    0      0      0) → 0

This gives a kind of quantization of the weights. Then I will multiply these weights with the input, which gives me outputs, also in power-of-two form. I will first perform a standard quantization on them, rounding the result from each column to the nearest quantization level, and then convert from the power-of-two form back to an actual number and pass it to the next layer.
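As a toy sketch of the weight decomposition above (my own simplified version: it assumes nonnegative weights exactly representable in the chosen number of bits, and ignores the sign handling the real hardware would need):

```python
import torch

def pow2_rows(weights: torch.Tensor, n_bits: int = 4) -> torch.Tensor:
    # Decompose each weight into bits over columns 2^0, 2^-1, ..., 2^-(n_bits-1).
    # Greedy: take each power of two if the remainder is at least that large.
    w = weights.flatten().clone()
    bits = torch.zeros(w.numel(), n_bits)
    for k in range(n_bits):
        scale = 2.0 ** (-k)
        bits[:, k] = (w >= scale).float()
        w = w - bits[:, k] * scale
    return bits

def rows_to_values(bits: torch.Tensor) -> torch.Tensor:
    # Invert the decomposition: weighted sum of the bit columns.
    scales = 2.0 ** (-torch.arange(bits.shape[1], dtype=torch.float))
    return bits @ scales

w = torch.tensor([0.625, 0.125, 0.5, 0.0])
bits = pow2_rows(w)
assert torch.equal(rows_to_values(bits), w)  # lossless for these weights
```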

If you are curious, this is a way to simulate the hardware architecture I’m currently working on. Since it performs these quantizations, it will hurt accuracy, and I want to measure by how much.

I guess my major concern is that I don’t know the syntax to do this.

Is there a way to perform this operation if I know the name of the layer? I believe I can restrict is_actual_layer to a few cases, as you pointed out.
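To make the syntax question concrete, here is the kind of loop I imagine: restrict to nn.Conv2d/nn.Linear and overwrite their weights in place under torch.no_grad(). The quantize_pow2 here is just a placeholder round-to-nearest-power-of-two, not my real scheme:

```python
import torch
import torch.nn as nn

def quantize_pow2(w: torch.Tensor) -> torch.Tensor:
    # Placeholder scheme: snap each nonzero weight's magnitude to the
    # nearest power of two, keeping the sign (zeros stay zero).
    mag = w.abs().clamp(min=1e-12)
    snapped = 2.0 ** torch.round(torch.log2(mag))
    return torch.sign(w) * snapped

model = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2))

with torch.no_grad():
    for name, layer in model.named_modules():
        if isinstance(layer, (nn.Conv2d, nn.Linear)):
            # copy_ overwrites the parameter's values in place.
            layer.weight.copy_(quantize_pow2(layer.weight))
            print(f"quantized {name}.weight")
```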