How to use only one of the RGB channels in the forward function?

I want to build a CNN for image classification. The input images should be of type RGB but every channel is just the same grayscale image (intentionally). Nevertheless, I only want to use the first channel of these three in the forward function for the conv2d layers. I’ve tried something like:

...

def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, padding=2)
        ...

def forward(self, x):
        x = x[:,0]
        x = F.relu(self.conv1(x))
        ...

I’ve got the following error message while training (batch size is 64, images are 32x32):

RuntimeError: Given groups=1, weight of size [32, 1, 3, 3], expected input[1, 64, 30, 30] to have 1 channels, but got 64 channels instead

I suspect the mistake is in how I slice x. I’ve tried different slicing versions (x[0], x[:,0], x[:,:,0]) for this reason, but none of them worked.

Does anyone have a clue what I am doing wrong here?

The issue is that indexing x like this removes the ‘channels’ dimension. That is, if you run:

import torch

x = torch.rand(64, 3, 32, 32) # [batch_size, channels, height, width]
x = x[:, 0] # integer index drops the channel dimension
print(x.size())

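The output is torch.Size([64, 32, 32]). The convolutional layer then interprets this 3-D tensor as a single unbatched image whose first dimension (64) is the channel count, which is why the error complains about 64 channels. For x to be compatible with the convolutional layer, you would need shape [64, 1, 32, 32]. That can be done with ‘unsqueeze’, e.g.: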

def forward(self, x):
    x = x[:,0].unsqueeze(1) # 3 channels -> 1 channel
    x = F.relu(self.conv1(x))
    ...
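
As a self-contained check, here is a minimal sketch that applies the fix to a toy model and compares it with an alternative: slicing with a range (x[:, 0:1]) instead of an integer index, which keeps the channel dimension without needing unsqueeze. The class name FirstChannelNet is hypothetical and only the first conv layer from the question is included; the rest of the network is left out.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FirstChannelNet(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        # Same first layer as in the question: 1 input channel, 32 output channels.
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, padding=2)

    def forward(self, x):
        # Range slicing (0:1) keeps the channel dimension,
        # unlike integer indexing (x[:, 0]), which drops it.
        x = x[:, 0:1]
        return F.relu(self.conv1(x))


x = torch.rand(64, 3, 32, 32)  # [batch_size, channels, height, width]

a = x[:, 0].unsqueeze(1)  # drop the channel dim, then add it back
b = x[:, 0:1]             # never drop it in the first place

print(a.shape, b.shape)   # both are [64, 1, 32, 32]
print(torch.equal(a, b))  # the two approaches select the same values

out = FirstChannelNet()(x)
print(out.shape)          # [64, 32, 34, 34]: padding=2 with a 3x3 kernel grows 32 to 34
```

Both variants should give the same tensor, so which one to use is mostly a matter of taste; the range slice is slightly shorter, while unsqueeze makes the added dimension explicit.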

@i4ata thanks for your help and the nice explanation!