Pruning ResNet18 model

Hi everyone,
I am trying to prune different architectures as explained in this paper:
https://arxiv.org/pdf/1611.06440v1.pdf

Basically, I assign a score to each filter in every convolutional layer based on a given criterion, and then I remove the k lowest-ranked filters from the model.
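
For concreteness, here is a minimal sketch of the ranking step using a simple L1-norm criterion (the paper's actual criterion is based on a Taylor expansion of the loss; rank_filters_l1 is just an illustrative name):

import torch
from torch import nn

def rank_filters_l1(conv: nn.Conv2d) -> torch.Tensor:
    # One score per output filter: the L1 norm of its weights.
    # conv.weight has shape (out_channels, in_channels, kH, kW).
    scores = conv.weight.data.abs().sum(dim=(1, 2, 3))
    return scores.argsort()  # ascending: lowest-scored filters come first

The first k entries of the returned ordering are the filters to remove.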

After successfully applying this procedure to VGG16, I am now trying to do the same on ResNet18, but I am having some problems. I am using the model provided by the torchvision package with pretrained weights.

Let’s say that I erase 9 filters from the second conv layer: the problem arises when I try to retrain the model, as I get the following error:

running_mean should contain 55 elements not 64 

I managed to isolate the problem to the following block of the architecture:

  (4): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 55, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(55, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(55, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )

As you can see, I changed the input/output channel counts to match the new dimension (55 in this case). I also sliced the weights of the convolutional layers and the weight/bias/running_mean/running_var of the BatchNorm layer following the convolution. The error concerns the (bn2) layer, so there must be something I am missing about BatchNorm.
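
For reference, this is roughly the per-tensor surgery I am describing (a minimal sketch; keep_idx is a placeholder for the indices of the 55 filters I keep):

import torch
import torchvision.models as models

resnet18 = models.resnet18(pretrained=True)
block = resnet18.layer1[0]       # the BasicBlock shown above
keep_idx = torch.arange(55)      # placeholder: indices of the kept filters

# conv1: keep only the selected output filters (dim 0 of the weight)
block.conv1.weight.data = block.conv1.weight.data[keep_idx].clone()
block.conv1.out_channels = 55

# bn1 normalizes conv1's output, so each per-channel tensor shrinks too
block.bn1.weight.data = block.bn1.weight.data[keep_idx].clone()
block.bn1.bias.data = block.bn1.bias.data[keep_idx].clone()
block.bn1.running_mean = block.bn1.running_mean[keep_idx].clone()
block.bn1.running_var = block.bn1.running_var[keep_idx].clone()
block.bn1.num_features = 55

# conv2: only its input channels (dim 1 of the weight) shrink
block.conv2.weight.data = block.conv2.weight.data[:, keep_idx].clone()
block.conv2.in_channels = 55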
Thank you in advance for your help!

Your workflow sounds reasonable. Could you post a small code snippet reproducing this issue, if that’s possible?
Also, are you sure the error points to bn2?
Make sure to call your script using CUDA_LAUNCH_BLOCKING=1 python script.py args if you are using your GPU. Otherwise, due to the asynchronous CUDA calls, the stack trace might point to a wrong line of code.

Thanks for the reply. I managed to solve the issue, which was due to a mistake on my part: I was modifying both the input and the output channel dimensions of the pruned convolutional layers' weights, instead of only the dimension that actually changes.
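
In code terms (a toy illustration with dummy layers, not my actual script): a Conv2d weight has shape (out_channels, in_channels, kH, kW), and pruning a layer's filters should shrink only one channel dimension per weight:

import torch
from torch import nn

conv1 = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False)
conv2 = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False)
keep_idx = torch.arange(55)  # filters kept in conv1

# Wrong (my mistake): slicing both channel dimensions of each weight
# conv1.weight.data = conv1.weight.data[keep_idx][:, keep_idx]

# Right: dim 0 (output filters) of the pruned layer,
# dim 1 (input channels) of the layer consuming its output
conv1.weight.data = conv1.weight.data[keep_idx]
conv2.weight.data = conv2.weight.data[:, keep_idx]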

This works fine for me:

import torch
from torch import nn
import torchvision.models as models
import torch.nn.utils.prune as prune

resnet18 = models.resnet18(pretrained=True)

# Breadth-first traversal collecting all leaf modules
# (modules with no children), since only those hold the weights.
parents = [resnet18]
leaves = []

while parents:
    m = parents.pop(0)
    if not list(m.children()):
        leaves.append(m)
    else:
        for x in m.children():
            if not list(x.children()):
                leaves.append(x)
            else:
                parents.append(x)

parameters_to_prune = [(module, 'weight') for module in leaves
                       if isinstance(module, (nn.Conv2d, nn.Linear))]

prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.2)
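
As a quick sanity check (my addition on top of the snippet above): after global_unstructured, each pruned module's weight is re-parametrized as weight_orig * weight_mask, so the achieved sparsity can be read off directly:

# Fraction of weights zeroed out across all pruned parameters
zeros = sum(int(torch.sum(m.weight == 0)) for m, _ in parameters_to_prune)
total = sum(m.weight.nelement() for m, _ in parameters_to_prune)
print(f"Global sparsity: {100.0 * zeros / total:.2f}%")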