How to delete some layers in a pretrained model?

I have a MobileNetV3Small model:

MobileNetV3(
  (features): Sequential(
    (0): ConvBNActivation(
      (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (1): BatchNorm2d(16, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
      (2): Hardswish()
    )
    (1): InvertedResidual(
      (block): Sequential(
        (0): ConvBNActivation(
          (0): Conv2d(16, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=16, bias=False)
          (1): BatchNorm2d(16, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
        )

But I want to remove only the BatchNorm2d layers from the pretrained model,
and then save the result as a new model of my own.
How can I do that?

There are several options to remove/replace this layer depending on your use case.

You could copy the source code of the model, remove the layer, change the forward method, and store this new model definition as a custom model. This would allow you to load this new model in your scripts and train it if needed. The state_dict would not contain this layer (and in particular its parameters) anymore, so you would not be able to directly load a pretrained state_dict (using strict=False as a workaround should work).
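
For example, a minimal sketch of this workflow (the toy Original and NoBatchNorm modules below are just stand-ins for the copied model source with and without the BatchNorm2d layer):

import torch
import torch.nn as nn


class Original(nn.Module):
    # stand-in for the original (pretrained) model definition
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, stride=2, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(16)
        self.act = nn.Hardswish()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))


class NoBatchNorm(nn.Module):
    # copied definition with the BatchNorm2d layer removed
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, stride=2, padding=1, bias=False)
        self.act = nn.Hardswish()

    def forward(self, x):
        return self.act(self.conv(x))


pretrained = Original()  # in practice this would be the loaded, pretrained model
new_model = NoBatchNorm()

# strict=False skips the keys that belong to the removed layer
missing, unexpected = new_model.load_state_dict(pretrained.state_dict(), strict=False)
print(unexpected)
# ['bn.weight', 'bn.bias', 'bn.running_mean', 'bn.running_var', 'bn.num_batches_tracked']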

Alternatively, you could also load the original model and replace the unwanted layer with an nn.Identity. While this approach might be simpler, you would need to replace this layer after each model creation.
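
For the second approach, a minimal sketch (assuming the torchvision model; the replace_batchnorm helper is only an illustration, not a torchvision function):

import torch
import torch.nn as nn
from torchvision import models

# assumes torchvision >= 0.13; older versions use pretrained=True instead of weights
model = models.mobilenet_v3_small(weights="DEFAULT")


def replace_batchnorm(module: nn.Module) -> None:
    # recursively swap every BatchNorm2d submodule for an nn.Identity
    for name, child in module.named_children():
        if isinstance(child, nn.BatchNorm2d):
            setattr(module, name, nn.Identity())
        else:
            replace_batchnorm(child)


replace_batchnorm(model)

# the forward pass still works; the former BatchNorm positions are now no-ops
out = model(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 1000])

# save the modified model's weights for later use
torch.save(model.state_dict(), "mobilenet_v3_small_no_bn.pth")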


Thank you! I found the source code of MobileNetV3Small and
just removed some layers; it works well.
However, I couldn't use the pretrained network, and the model overfits on the validation data set…

==> As you mentioned, when loading the pretrained model with load_state_dict,
using strict=False removed the unwanted layer like magic!!! Thank you!!! PyTorch is a very nice tool!!
Thank you!!! Thank you for saving my life!!! Thank you!!!

@ptrblck Will replacing the unwanted layer with nn.Sequential() do the same thing as nn.Identity(), without any side effects?

Yes, it would work at the moment, but I would stick to the nn.Identity workflow as it seems to be cleaner than depending on an empty nn.Sequential container to return the input activation.
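
A quick check of the equivalence (minimal sketch):

import torch
import torch.nn as nn

x = torch.randn(2, 3)

identity = nn.Identity()
empty_seq = nn.Sequential()  # no submodules, so the input is returned unchanged

print(torch.equal(identity(x), x))   # True
print(torch.equal(empty_seq(x), x))  # True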


Oh I see. Thank you🙂

@ptrblck Could I ask more questions about replacing the layer?

Let me assume a simple model:

import torch
import torch.nn as nn


class Foo(nn.Module):
    def __init__(self):
        super(Foo, self).__init__()
        self.bar = nn.Conv2d(1, 2, 3)
        
foo = Foo()
foo.bar = nn.Linear(4, 5)  # replacement
  1. When replacing the foo.bar layer via assignment, do we need the guard of torch.no_grad()? I'm concerned because new parameters are coming in and the old ones are gone. Or do we not need it, since it is just a replacement in the namespace (foo.bar.__dict__)? When replacing a parameter itself, we needed torch.no_grad(), as you mentioned here: How to assign an arbitrary tensor to model's parameter? - #9 by ptrblck.

  2. Are the parameters inside the old layer (here the convolutional layer) freed from memory, since nothing references them anymore?

  1. I would wrap it in a no_grad() block, to make sure that this assignment is not tracked by Autograd.

  2. If these parameters are not referenced anymore, they’ll be freed. E.g. if you’ve passed them to an optimizer before replacing them, they might be kept alive (and the new parameters would have to be added to the optimizer).
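
Putting both points together, a minimal sketch based on the Foo example above (the shapes, learning rate, and optimizer choice are made up for illustration):

import torch
import torch.nn as nn


class Foo(nn.Module):
    def __init__(self):
        super().__init__()
        self.bar = nn.Conv2d(1, 2, 3)


foo = Foo()
optimizer = torch.optim.SGD(foo.parameters(), lr=0.1)  # holds references to the old conv parameters

with torch.no_grad():
    foo.bar = nn.Linear(4, 5)  # the replacement itself is not tracked by Autograd

# the optimizer still references the old parameters, so recreate it
# (or register the new parameters via optimizer.add_param_group)
optimizer = torch.optim.SGD(foo.parameters(), lr=0.1)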


Thank you very much for your kind explanation 😄
