Conv2d set weights and output channels of different sizes

Hi,

Please help me solve the following confusion:
I have 4 filters of size (4, 4). I want to assign the values of these filters to my conv2d weights and then visualize them.
Below I define my model for a grayscale image. When I run the code, it works correctly.
But when I change out_channels from 4 to 5 or 2, I expect the model to stop working, since I initialize weights for 4 filters but expect 5 or 2 in the output. Yet the model still works, returns 4 feature maps, and no error is thrown. Why does this happen?

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    
    def __init__(self, weight):
        super(Net, self).__init__()
        # assumes there are 4 grayscale filters
        self.conv = nn.Conv2d(1, 4, kernel_size=(4, 4), bias=False)
        self.conv.weight = torch.nn.Parameter(weight)

    def forward(self, x):
        conv_x = self.conv(x)
        activated_x = F.relu(conv_x)
        return conv_x, activated_x
    
weight = torch.from_numpy(filters).unsqueeze(1).type(torch.FloatTensor)
model = Net(weight)
print(model)

Your code is not fully defined, since e.g. filters is missing.
However, it seems to work fine, and the output shape corresponds to the passed number of filters, as seen here:

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, weight):
        super(Net, self).__init__()
        # assumes there are 4 grayscale filters
        self.conv = nn.Conv2d(1, 4, kernel_size=(4, 4), bias=False)
        self.conv.weight = torch.nn.Parameter(weight)

    def forward(self, x):
        conv_x = self.conv(x)
        activated_x = F.relu(conv_x)
        return conv_x, activated_x


weight = torch.randn(4, 1, 4, 4)
model = Net(weight)
x = torch.randn(1, 1, 24, 24)
out = model(x)
print(out[0].shape)
> torch.Size([1, 4, 21, 21])

weight = torch.randn(5, 1, 4, 4)
model = Net(weight)
out = model(x)
print(out[0].shape)
> torch.Size([1, 5, 21, 21])

Hi, thanks for the answer.
My point is that if I use nn.Conv2d(1, 5, kernel_size=(4, 4), bias=False), i.e. 5 output channels, but still provide, let's say, 4 filters (weight = torch.randn(4, 1, 4, 4)), it still works (I was assuming that I must provide 5 filters). Probably it is allowed to overwrite the 5 output channels with 4 (please let me know whether I am right or wrong). This was very strange to me, and this behaviour is prone to potential bugs.

Thanks for the follow-up. I assume you think passing a new weight kernel with a different number of filters should be disallowed, since the original layer was initialized with another number of kernels?
If so, then note that it’s not disallowed to manipulate the underlying parameters (at your own risk).
While this might be prone to potential bugs, I would consider this an advanced approach and don’t think we should disallow it as it could limit potentially valid use cases.
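To make the behaviour concrete, here is a small sketch (not part of the thread above): the layer's out_channels value is stored once in __init__ and is not re-validated when you replace conv.weight directly, so the forward pass simply follows the shape of the new weight tensor. If you want shape checking, load_state_dict does validate shapes and will raise on a mismatch:

```python
import torch
import torch.nn as nn

# Direct Parameter assignment is not shape-checked against out_channels.
conv = nn.Conv2d(1, 5, kernel_size=(4, 4), bias=False)
conv.weight = nn.Parameter(torch.randn(4, 1, 4, 4))  # only 4 filters

x = torch.randn(1, 1, 24, 24)
out = conv(x)
print(conv.out_channels)  # still reports 5: the attribute is now stale
print(out.shape)          # torch.Size([1, 4, 21, 21]): forward follows the weight

# load_state_dict, by contrast, validates shapes and raises on a mismatch:
conv2 = nn.Conv2d(1, 5, kernel_size=(4, 4), bias=False)
try:
    conv2.load_state_dict({"weight": torch.randn(4, 1, 4, 4)})
except RuntimeError as e:
    print("shape check caught:", e)
```

So if accidental mismatches are a concern, loading weights via load_state_dict gives you the error you expected, while direct assignment stays available for advanced use cases.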
