RuntimeError: Given weight of size 3 1 5 5, expected bias to be 1-dimensional with 3 elements, but got bias of size [6] instead

class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5, bias=False)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        out = self.conv1(x)
        out = F.relu(out)
        out = F.max_pool2d(out, 2)
        out = F.relu(self.conv2(out))
        out = F.max_pool2d(out, 2)
        out = out.view(out.size(0), -1)
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        out = self.fc3(out)
        return out

model = LeNet().to(device=device)

if args.model:
    if os.path.isfile(args.model):
        print("=> loading checkpoint '{}'".format(args.model))
        checkpoint = torch.load(args.model)
        args.start_epoch = checkpoint['epoch']
        best_prec1 = checkpoint['best_prec1']
        model.load_state_dict(checkpoint['state_dict'])
        print("=> loaded checkpoint '{}' (epoch {}) Prec1: {:f}".format(args.model, checkpoint['epoch'], best_prec1))
    else:
        print("=> no checkpoint found at '{}'".format(args.model))

print('Pre-processing Successful!')

def test(model):
    kwargs = {'num_workers': 1, 'pin_memory': True} if args.cuda else {}
    test_loader = torch.utils.data.DataLoader(
        datasets.MNIST('./data', train=True, download=True,
                       transform=transforms.Compose([
                           transforms.Resize((32, 32)),
                           transforms.ToTensor(),
                           transforms.Normalize((0.1307,), (0.3081,))
                       ])),
        batch_size=args.test_batch_size, shuffle=True, **kwargs)

I am getting this error:

RuntimeError: Given groups=1, weight of size 3 6 5 5, expected input[256, 3, 14, 14] to have 6 channels, but got 3 channels instead

If I change bias=True, then I get this error:

RuntimeError: Given weight of size 3 1 5 5, expected bias to be 1-dimensional with 3 elements, but got bias of size [6] instead

Your model seems to work fine with bias=False and bias=True:

class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5, bias=False)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
    
    def forward(self, x):
        out = self.conv1(x)
        out = F.relu(out)
        out = F.max_pool2d(out, 2)
        out = F.relu(self.conv2(out))
        out = F.max_pool2d(out, 2)
        out = out.view(out.size(0), -1)
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        out = self.fc3(out)
        return out

model = LeNet()
x = torch.randn(1, 1, 32, 32)
out = model(x)
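# Sanity check: a 1x1x32x32 dummy input runs through the model and yields 10 class logits.
print(out.shape)  # torch.Size([1, 10])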

If you are running the code in a notebook, make sure to initialize all cells before the actual model execution.


I also have a similar problem. My input image is 1-channel, but I get this:

RuntimeError: Given groups=1, weight of size [60, 1, 3, 3], expected input[1, 2, 264, 184] to have 1 channels, but got 2 channels instead

I tried checking the model

x = torch.randn((1, 1, height, width))
x = model(x)
print(x.shape)
print(model)

I got this: torch.Size([1, 1, 1056, 736]), so the model itself runs fine on a 1-channel input.

Please, I need help with this as soon as possible.

Actually, I'm using a SwinIR model for super-resolution. The model originally accepts RGB images, but I'm using grayscale.

Your input images have 2 channels, which is not a grayscale format, so you might need to revisit your data loading and processing pipeline to make sure a single channel is used.
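If it helps, here is a minimal sketch of two common ways to force a single channel, assuming a PIL-based loading pipeline with torchvision transforms (the file path and transform composition are illustrative, not taken from your code; a 2-channel tensor often comes from an "LA" image, i.e. grayscale plus alpha):

from PIL import Image
from torchvision import transforms

# Option 1: convert to single-channel grayscale right after loading ("example.png" is a placeholder path).
img = Image.open("example.png").convert("L")  # "L" mode = 1 channel, drops any alpha channel

# Option 2: add a Grayscale transform at the start of the preprocessing pipeline.
preprocess = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),  # collapses RGB/LA inputs to 1 channel
    transforms.ToTensor(),
])

x = preprocess(img)
print(x.shape)  # expected: torch.Size([1, H, W])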

Thank you so much, but I was unable to solve the error. I have searched every line of the code and cannot find where my 1-channel data is converted to 2 channels.

Also, I was trying to do transfer learning with SwinIR. As I said, it was designed for 3 channels, so I get size mismatch errors:

size mismatch for conv_first.weight: copying a param with shape torch.Size([180, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([180, 1, 3, 3]).
size mismatch for conv_last.weight: copying a param with shape torch.Size([3, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([1, 64, 3, 3]).
size mismatch for conv_last.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([1]).

Is it possible to remove these mismatched layers and train the model?

In case you are manipulating the model by replacing the first and last conv layers, load the state_dict beforehand, and replace the layers afterwards to avoid these shape mismatch errors.
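A rough sketch of that order of operations, assuming the SwinIR class from the official repo and the conv_first / conv_last attribute names from your error messages (the import path, constructor arguments, checkpoint path, and the padding of the replacement layers are assumptions, so adjust them to the actual model definition):

import torch
import torch.nn as nn

# 1) Build the model in its original 3-channel configuration so the pretrained
#    checkpoint shapes match (import path and in_chans=3 are assumptions).
from models.network_swinir import SwinIR
model = SwinIR(in_chans=3)  # plus whatever other config arguments you already use

# 2) Load the pretrained weights into the 3-channel model first.
state_dict = torch.load("swinir_pretrained.pth")  # placeholder path; unwrap e.g. a "params" key if the checkpoint nests the weights
model.load_state_dict(state_dict)

# 3) Replace the input/output convs with 1-channel versions afterwards.
#    Shapes inferred from the reported mismatches:
#      conv_first.weight was [180, 3, 3, 3] -> new layer takes 1 input channel
#      conv_last.weight  was [3, 64, 3, 3]  -> new layer produces 1 output channel
model.conv_first = nn.Conv2d(1, 180, kernel_size=3, padding=1)
model.conv_last = nn.Conv2d(64, 1, kernel_size=3, padding=1)

# The replaced layers are randomly initialized; everything else keeps the pretrained weights.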