RuntimeError: Given groups=1, weight[64, 3, 3, 3], so expected input[16, 64, 256, 256] to have 3 channels, but got 64 channels instead

Why am I getting this error?

RuntimeError: Given groups=1, weight[64, 3, 3, 3], so expected input[16, 64, 256, 256] to have 3 channels, but got 64 channels instead

I wrote an implementation of U-net.

class double_conv(nn.Module):
  def __init__(self, in_ch, out_ch):
    super(double_conv, self).__init__()
    self.conv1 = nn.Conv2d(in_ch, out_ch, 3, padding=1)
    self.conv2 = nn.Conv2d(in_ch, out_ch, 3, padding=1)
  def forward(self, x):
    x = F.relu(self.conv1(x))
    x = F.relu(self.conv2(x))
    return x
class input_conv(nn.Module):
  def __init__(self, in_ch, out_ch):
    super(input_conv, self).__init__()
    self.inp_conv = double_conv(in_ch, out_ch)
  def forward(self, x):
    x = self.inp_conv(x)
    return x

class up(nn.Module):
  def __init__(self, in_ch, out_ch):
    super(up, self).__init__()
    self.up_conv = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
    self.conv = double_conv(in_ch, out_ch)
  def forward(self, x1, x2):
    x1 = self.up_conv(x1)
    x =[x2, x1], dim=1)
    x = self.conv(x)
    return x
class down(nn.Module):
  def __init__(self, in_ch, out_ch):
    super(down, self).__init__()
    self.pool = nn.MaxPool2d(2)
    self.conv = double_conv(in_ch, out_ch)
  def forward(self, x):
    x = self.pool(x)
    x = self.conv(x)
    return x
class last_conv(nn.Module):
  def __init__(self, in_ch, out_ch):
    super(last_conv, self).__init__()
    self.conv1 = nn.Conv2d(in_ch, out_ch, 1)
  def forward(self, x):
    x = self.conv1(x)
    return x

class Unet(nn.Module):
  def __init__(self, channels, classes):
    super(Unet, self).__init__()
    self.inp = input_conv(channels, 64)
    self.down1 = down(64, 128)
    self.down2 = down(128, 256)
    self.down3 = down(256, 512)
    self.down4 = down(512, 1024)
    self.up1 = up(1024, 512)
    self.up2 = up(512, 256)
    self.up3 = up(256, 128)
    self.up4 = up(128, 64)
    self.out = last_conv(64, classes)
  def forward(self, x):
    x1 = self.inp(x)
    x2 = self.down1(x1)
    x3 = self.down2(x2)
    x4 = self.down3(x3)
    x5 = self.down4(x4)
    x = self.up1(x5, x4)
    x = self.up2(x, x3)
    x = self.up3(x, x2)
    x = self.up4(x, x1)
    x = self.out(x)
    return x

model = Unet(3, 1)

This is the training loop

for epoch in range(5):
    for i, data in enumerate(trainloader):
        inputs, labels = data
        inputs = Variable(inputs).cuda()
        labels = Variable(labels).cuda()
        # forward + backward + optimize
        # zero the gradient buffers of all parameters
        optimizer.zero_grad()
        # forward pass
        outputs = model(inputs)
        # calculate the loss
        loss = loss_function(outputs, labels)
        # backpropagation
        loss.backward()
        # does the update after calculating the gradients
        optimizer.step()
        if (i+1) % 5 == 0:  # print every 5 mini-batches
            print('[%d, %5d] loss: %.4f' % (epoch, i+1,[0]))

It means your input should have 3 channels, but you gave a 64-channel input. Inputs are organized in [N, C, H, W] format; your input, i.e. the data layer, should have 3 channels. You should check your code.

My input does have 3 channels. I construct the net as Unet(3, 1), which corresponds to 3 channels and 1 class.

I believe that in @junyuseu’s answer, "input" means the input to a conv layer, not the input to the actual network. In fact, unless your in_ch always equals out_ch, the double_conv module will always throw this error.

The error is here: in the double_conv class, the first conv layer’s input dim is in_ch (3) and its output dim is out_ch (64). The second conv layer is defined the same way, but its actual input has 64 channels, so its input dim should be out_ch, not in_ch (3).
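A corrected double_conv would set the second conv layer’s in_channels to out_ch. A minimal sketch of the fix:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class double_conv(nn.Module):
  def __init__(self, in_ch, out_ch):
    super(double_conv, self).__init__()
    self.conv1 = nn.Conv2d(in_ch, out_ch, 3, padding=1)
    # the second conv receives out_ch channels from conv1, not in_ch
    self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
  def forward(self, x):
    x = F.relu(self.conv1(x))
    x = F.relu(self.conv2(x))
    return x

block = double_conv(3, 64)
out = block(torch.randn(2, 3, 32, 32))
print(out.shape)  # torch.Size([2, 64, 32, 32])
```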


Yup, made a silly mistake. Thanks for the help

I got the same kind of runtime error:

RuntimeError: Given groups=1, weight of size [64, 3, 7, 7], expected input[1, 4, 224, 224] to have 3 channels, but got 4 channels instead

I've been trying to correct it for a while now but can't seem to see where the mistake is.
This is my model arch.

any help would be appreciated

Your input contains 4 channels, while the first conv layer expects an input with 3 channels.
If you are dealing with RGB images, the 4th channel might be the alpha channel, which could just be removed.
If you are using a custom DataLoader, you could probably just use:

def __getitem__(self, index):
    img =[index]).convert('RGB')
    # Alternatively remove the alpha channel from the tensor
    img =[index])
    x = TF.to_tensor(img)
    x = x[:3]
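As a quick check that slicing drops the alpha channel (using a random tensor as a stand-in for a decoded RGBA image, since `to_tensor` puts channels first):

```python
import torch

# stand-in for an RGBA image tensor in [C, H, W] layout (4 channels)
x = torch.rand(4, 224, 224)
x = x[:3]  # keep only the first three (RGB) channels
print(x.shape)  # torch.Size([3, 224, 224])
```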

Thanks for your response! I had a different but related question.

If you are dealing with a 1-channel grayscale image but want to utilize a pretrained network (I am working off of a resnet18-based repo), would you suggest duplicating the channel 3 times so it fits the model, or is there a better approach?

I think this would be the easiest approach.
Alternatively, you could also try to reduce the pretrained filter channels (mean, sum, ...) and see if that gives better performance (I haven't compared these approaches yet).
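A sketch of both options. `conv_rgb` stands in for a pretrained first layer (randomly initialized here, so the weights are illustrative); for the first layer alone, repeating the grayscale channel is mathematically equivalent to summing the pretrained filters over the input dimension:

```python
import torch
import torch.nn as nn

gray = torch.rand(1, 1, 224, 224)  # dummy 1-channel batch

# Option 1: repeat the single channel three times to match the model
rgb_like = gray.repeat(1, 3, 1, 1)  # -> [1, 3, 224, 224]

# Option 2: collapse the first layer's filters to 1 input channel
conv_rgb = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
conv_gray = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
with torch.no_grad():
    # sum the weights over the input-channel dimension
    conv_gray.weight.copy_(conv_rgb.weight.sum(dim=1, keepdim=True))

# both paths give the same activations for the first layer
out_a = conv_rgb(rgb_like)
out_b = conv_gray(gray)
print(torch.allclose(out_a, out_b, atol=1e-4))  # True
```

Option 2 only covers the first conv layer, but it avoids tripling the input memory and keeps the rest of the pretrained network untouched.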
