Hi everyone,

I am trying to prune different architectures as explained in this paper:

https://arxiv.org/pdf/1611.06440v1.pdf

Basically, I assign a score to each filter in every convolutional layer based on a given criterion, and then I remove the k lowest-ranked filters from the model.
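For context, the ranking step looks roughly like this. This is only a sketch: I use the L1 norm of each filter's weights as a stand-in criterion (the paper actually proposes a Taylor-expansion criterion), and the function names are just illustrative:

```python
import torch
import torch.nn as nn

def rank_filters_l1(conv: nn.Conv2d) -> torch.Tensor:
    """Score each output filter of a conv layer.

    L1 norm of the filter weights is used here as a simple stand-in
    for the Taylor criterion from the paper.
    """
    # weight shape: (out_channels, in_channels, kH, kW)
    # -> sum |w| over everything except the out_channels dimension
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))

def lowest_k_filters(conv: nn.Conv2d, k: int) -> list:
    """Return the indices of the k lowest-scoring filters."""
    scores = rank_filters_l1(conv)
    return torch.argsort(scores)[:k].tolist()
```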

After successfully applying this procedure to VGG16, I am now trying to do the same with ResNet18, but I am running into problems. I am using the pretrained model provided by the torchvision package.

Let’s say that I remove 9 filters from the second conv layer: the problem arises when I try to retrain the model, as I get the following error:

```
running_mean should contain 55 elements not 64
```

I managed to isolate the problem to the following block of the architecture:

```
(4): Sequential(
  (0): BasicBlock(
    (conv1): Conv2d(64, 55, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn1): BatchNorm2d(55, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace)
    (conv2): Conv2d(55, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
)
```

As you can see, I changed the input/output channel counts to match the new dimension (55 in this case). I also sliced the weights of the two convolutional layers and the weight/bias/running_mean/running_var of the BatchNorm layer that follows the pruned convolution. The error concerns the (bn2) layer, so there must be something about BatchNorm that I am missing.
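For reference, the surgery I am attempting on the first conv/bn pair of the block looks roughly like this (a self-contained sketch with an illustrative helper name; conv2's output channels, and therefore bn2, are deliberately left untouched here):

```python
import torch
import torch.nn as nn

def prune_block_conv1(conv1, bn1, conv2, keep):
    """Keep only the output filters of conv1 listed in `keep`,
    slice bn1 to match, and shrink the matching *input* channels
    of conv2. bn2 is not touched, since conv2's out_channels keep
    their original width.
    """
    idx = torch.tensor(keep)

    new_conv1 = nn.Conv2d(conv1.in_channels, len(keep),
                          kernel_size=conv1.kernel_size,
                          stride=conv1.stride,
                          padding=conv1.padding, bias=False)
    # slice along the output-channel dimension (dim=0)
    new_conv1.weight.data = conv1.weight.data[idx].clone()

    new_bn1 = nn.BatchNorm2d(len(keep))
    new_bn1.weight.data = bn1.weight.data[idx].clone()
    new_bn1.bias.data = bn1.bias.data[idx].clone()
    new_bn1.running_mean = bn1.running_mean[idx].clone()
    new_bn1.running_var = bn1.running_var[idx].clone()

    new_conv2 = nn.Conv2d(len(keep), conv2.out_channels,
                          kernel_size=conv2.kernel_size,
                          stride=conv2.stride,
                          padding=conv2.padding, bias=False)
    # slice along the input-channel dimension (dim=1)
    new_conv2.weight.data = conv2.weight.data[:, idx].clone()

    return new_conv1, new_bn1, new_conv2
```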

Thank you in advance for your help!