Changing dilation rate, stride in Conv Layer without changing its weights

lavish619 · July 25, 2021, 11:39am

I am using a pretrained resnet101 and I want to change the dilation rates and stride of some conv layers.

If I initialize the layers again, that replaces the weights of those layers. But when only the stride or dilation rate changes, the weights should not need to change, because the kernel size (and therefore the weight shape) stays the same.

So how can I change the layer configuration without changing the weights, keeping the kernel size the same?

Currently, this is what I am doing, but it changes the weights.

self.resnet = torchvision.models.resnet101(pretrained=True)
for i in range(23):
    self.resnet.layer3[i].conv2 = nn.Conv2d(256, 256,
                                            kernel_size=3,
                                            stride=1, 
                                            padding=2,
                                            dilation=2, 
                                            bias=False)

KFrank · July 25, 2021, 6:18pm

Hi Lavish!

If it were me, I would probably do:

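for i in range(23):
    # Keep a reference to the pretrained weight tensor (not the whole module).
    saveWeight = self.resnet.layer3[i].conv2.weight
    self.resnet.layer3[i].conv2 = nn.Conv2d(256, 256,
                                            kernel_size=3,
                                            stride=1,
                                            padding=2,
                                            dilation=2,
                                            bias=False)
    # Copy the pretrained values into the new layer without autograd tracking.
    with torch.no_grad():
        self.resnet.layer3[i].conv2.weight.copy_(saveWeight)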

I think that the following would also work (but I haven’t tried it):

for i in range(23):
    self.resnet.layer3[i].conv2.dilation = (2, 2)

The issue is that I don’t know whether Conv2d does something
non-trivial with dilation when its constructor is called (other than
just self.dilation = dilation), and even if this scheme works
now, I don’t know whether it would be guaranteed to work in the future.
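One way to check the attribute-setting idea is to compare it against the copy-the-weights approach on a small stand-in layer. The sketch below is not from the thread: the 8-channel layer and the 16x16 input are hypothetical placeholders for the real resnet101 layers, and it assumes that Conv2d's forward pass reads stride, padding and dilation at call time. That appears to hold for the default padding_mode='zeros', but it is an implementation detail rather than a documented guarantee. Padding is changed along with dilation so the spatial size is preserved.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a pretrained 3x3 conv (small channel count for speed).
conv = nn.Conv2d(8, 8, kernel_size=3, stride=2, padding=1, bias=False)

# Reference: a fresh layer with the new settings, with the old weights copied in.
reference = nn.Conv2d(8, 8, kernel_size=3, stride=1, padding=2,
                      dilation=2, bias=False)
with torch.no_grad():
    reference.weight.copy_(conv.weight)

# Alternative: overwrite the attributes on the existing layer; weights untouched.
# Assumption: forward() uses these attributes at call time (padding_mode='zeros').
conv.stride = (1, 1)
conv.padding = (2, 2)
conv.dilation = (2, 2)

x = torch.randn(1, 8, 16, 16)
with torch.no_grad():
    out = conv(x)
    same = torch.allclose(out, reference(x))
    out_shape = tuple(out.shape)

print("outputs match:", same)
print("output shape:", out_shape)
```

If the two outputs match on your installed version, the in-place change behaves like the rebuilt layer there. Separately, in torchvision's resnet101 the first block of layer3 also appears to have a strided downsample branch, so changing only conv2's stride in that block would probably leave the residual shapes mismatched unless the downsample stride is changed as well.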

Best.

K. Frank

lavish619 · July 26, 2021, 4:54am

Hi, thanks for the solution.

But why did you use torch.no_grad() when copying the weights?

KFrank · July 26, 2021, 1:55pm

Hi Lavish!

Try performing the weight-copy without torch.no_grad() and see.
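For readers who want to see the result of that experiment, here is a minimal sketch using a small hypothetical Conv2d in place of the resnet layers. It assumes current PyTorch behavior, where an in-place write to a leaf tensor that requires grad raises a RuntimeError unless gradient tracking is disabled.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(4, 4, kernel_size=3, bias=False)
pretrained = torch.randn(4, 4, 3, 3)  # stand-in for saved pretrained weights

# Without no_grad: in-place write to a leaf tensor that requires grad.
try:
    conv.weight.copy_(pretrained)
    raised = False
except RuntimeError as err:
    raised = True
    print("copy_ failed:", err)

# With no_grad: autograd does not track the copy, so it goes through.
with torch.no_grad():
    conv.weight.copy_(pretrained)

copied = torch.equal(conv.weight, pretrained)
still_trainable = conv.weight.requires_grad
print("copied:", copied, "| still requires grad:", still_trainable)
```

The parameter keeps requires_grad=True afterwards, so the layer can still be fine-tuned; no_grad only keeps the copy itself out of the autograd graph.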

Best.

K. Frank