Hey guys!
My main goal is to reconstruct the forward propagation of a pretrained torchvision model in a generic way, so that I can work on the inputs/weights/outputs of the conv and fc layers of any model I load, online. (I'm not sure hooks are useful here because, as far as I understand, you need to run over the whole model to get these values, and I need to work on the inputs/weights/activations on the fly.)
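For reference, this is the kind of hook-based capture I was referring to; a minimal sketch that records the values per conv/fc layer, but only as the full forward pass runs:

import torch
import torch.nn as nn
import torchvision.models as models

model = models.vgg11()  # load pretrained weights as appropriate for your torchvision version
model.eval()

captured = {}

def make_hook(name):
    def hook(module, inputs, output):
        # inputs is a tuple of positional arguments; the weights live on the module itself
        captured[name] = {
            "input": inputs[0].detach(),
            "weight": module.weight.detach(),
            "output": output.detach(),
        }
    return hook

handles = [
    module.register_forward_hook(make_hook(name))
    for name, module in model.named_modules()
    if isinstance(module, (nn.Conv2d, nn.Linear))
]

_ = model(torch.randn(1, 3, 224, 224))  # one dummy forward pass fills `captured`
for h in handles:
    h.remove()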
This is proving annoying for two reasons. Let's take VGG11 as an example.
When I run the code
for x, y in tqdm(dataloader):
    for name, layer in model.named_modules():
        print(name)
    exit()
The output is
features
features.0
features.1
features.2
features.3
features.4
features.5
features.6
features.7
features.8
features.9
features.10
features.11
features.12
features.13
features.14
features.15
features.16
features.17
features.18
features.19
features.20
avgpool
classifier
classifier.0
classifier.1
classifier.2
classifier.3
classifier.4
classifier.5
classifier.6
Usually, when a name has no index, like features, it represents a group of layers; in this case
Sequential(
  (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU(inplace=True)
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (4): ReLU(inplace=True)
  (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (6): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (7): ReLU(inplace=True)
  (8): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (9): ReLU(inplace=True)
  (10): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (11): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (12): ReLU(inplace=True)
  (13): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (14): ReLU(inplace=True)
  (15): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (16): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (17): ReLU(inplace=True)
  (18): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (19): ReLU(inplace=True)
  (20): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
But when it has an index, like features.0, it’s an actual layer. In this case, features.0 is
Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
This is not always the case, though: avgpool has no index and is also not a group of layers; it's a single AdaptiveAvgPool2d layer.
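The closest I've come to telling the two apart is checking whether a module has children; a minimal sketch of the is_actual_layer helper I use below (I'm not sure how robust it is):

def is_actual_layer(module):
    # Leaf modules (Conv2d, ReLU, MaxPool2d, Linear, ...) have no children;
    # containers like Sequential only group other modules.
    return len(list(module.children())) == 0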
So, the first question is how to get at the actual layers so that I can just do a for loop and grab the input and output of each one. Running something like
correct = 0
total = 0
for x, y in tqdm(dataloader):
    for name, layer in model.named_modules():
        if is_actual_layer(layer):
            x = layer(x)  # the loop already gives us the module, no need to index into the model
    y_pred = x
    correct += (y_pred.argmax(dim=1) == y).sum().item()
    total += len(y)
That would be equivalent to
correct = 0
total = 0
for x, y in tqdm(dataloader):
    y_pred = model(x)
    correct += (y_pred.argmax(dim=1) == y).sum().item()
    total += len(y)
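As far as I can tell, the layer-by-layer version actually crashes at classifier.0 on VGG11 because the tensor is never flattened, which is where my second question comes in. Here's a sketch with the flatten hard-coded, which is exactly what I want to avoid (note it only works because VGG registers its modules in execution order; branching models like ResNet would break it):

correct = 0
total = 0
model.eval()
with torch.no_grad():
    for x, y in tqdm(dataloader):
        for name, layer in model.named_modules():
            if is_actual_layer(layer):
                # hard-coded: flatten before the first Linear layer
                if isinstance(layer, torch.nn.Linear) and x.dim() > 2:
                    x = torch.flatten(x, 1)
                x = layer(x)
        correct += (x.argmax(dim=1) == y).sum().item()
        total += len(y)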
The second thing is that, usually, in the forward function, the tensor has to be flattened before the classifier layers, since those are linear layers. Is there any way to know, generically, when and how to flatten the input? For instance, in VGG11's forward function we have
def forward(self, x: torch.Tensor) -> torch.Tensor:
    x = self.features(x)
    x = self.avgpool(x)
    x = torch.flatten(x, 1)  # the input is flattened after the avgpool layer
    x = self.classifier(x)
    return x
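While writing this up I found torch.fx, which might answer exactly this: symbolically tracing the model records functional calls made inside forward(), like torch.flatten, as graph nodes rather than just listing submodules. A minimal sketch (assuming PyTorch >= 1.8 and that the model is traceable):

import torch.fx

traced = torch.fx.symbolic_trace(model)
for node in traced.graph.nodes:
    # 'call_module' nodes are the layers (e.g. features.0);
    # 'call_function' nodes catch things like torch.flatten done in forward()
    print(node.op, node.target)

That would tell me when and how to flatten without hard-coding it per model, but I'm not sure it's the intended approach.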
Any help is very welcome! Thank you