RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x13056 and 153600x2048)

Your code generally works:

num_classes = 2
model = CNN()

x = torch.randn(2, 3, 192, 192)
out = model(x)
print(out.shape)
# torch.Size([2, 2])

and it seems the shape mismatch is raised in the loss calculation, so check the shapes of the model output as well as the target.

loss = loss_func(output, b_y)

As @ptrblck has mentioned, check the shapes using print statements.
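
For reference, nn.CrossEntropyLoss expects the model output to have the shape [batch_size, num_classes] and the target to have the shape [batch_size] containing class indices. A minimal sketch of those shapes, assuming that loss is used with a batch size of 64 and 2 classes:

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

batch_size, num_classes = 64, 2
output = torch.randn(batch_size, num_classes)          # model output: [64, 2]
target = torch.randint(0, num_classes, (batch_size,))  # target: [64] with class indices
loss = criterion(output, target)
print(output.shape, target.shape, loss.item())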

And why did you index with [0] here:

The same training code works on 1x28x28 MNIST data, but when I try to run it on 3x192x192 data it shows me an error.


----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 32, 192, 192]             896
       BatchNorm2d-2         [-1, 32, 192, 192]              64
              ReLU-3         [-1, 32, 192, 192]               0
         MaxPool2d-4           [-1, 32, 96, 96]               0
            Conv2d-5           [-1, 64, 96, 96]          18,496
       BatchNorm2d-6           [-1, 64, 96, 96]             128
              ReLU-7           [-1, 64, 96, 96]               0
         MaxPool2d-8           [-1, 64, 48, 48]               0
            Conv2d-9          [-1, 128, 48, 48]          73,856
      BatchNorm2d-10          [-1, 128, 48, 48]             256
             ReLU-11          [-1, 128, 48, 48]               0
        MaxPool2d-12          [-1, 128, 24, 24]               0
           Conv2d-13          [-1, 256, 24, 24]         295,168
      BatchNorm2d-14          [-1, 256, 24, 24]             512
             ReLU-15          [-1, 256, 24, 24]               0
        MaxPool2d-16          [-1, 256, 12, 12]               0
           Conv2d-17          [-1, 512, 12, 12]       1,180,160
      BatchNorm2d-18          [-1, 512, 12, 12]           1,024
             ReLU-19          [-1, 512, 12, 12]               0
        MaxPool2d-20            [-1, 512, 6, 6]               0
           Linear-21                  [-1, 256]       4,718,848
             ReLU-22                  [-1, 256]               0
      BatchNorm1d-23                  [-1, 256]             512
          Dropout-24                  [-1, 256]               0
           Linear-25                    [-1, 2]             514
================================================================
Total params: 6,290,434
Trainable params: 6,290,434
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.42
Forward/backward pass size (MB): 56.68
Params size (MB): 24.00
Estimated Total Size (MB): 81.10
----------------------------------------------------------------

Replace this with the code below -

print(output.shape)
print(b_y.shape)

torch.Size([2])
torch.Size([128])

train_losses = []
val_losses = []
test_losses =[]
train_auc = []
val_auc = []
train_auc_epoch = []
val_auc_epoch = []
best_auc = 0.0
min_loss = np.Inf

since = time.time()

for e in range(num_epochs):

    train_loss = 0.0
    val_loss = 0.0

    # Train the model
    model.train()
    for i, (images, labels) in enumerate(tqdm(train_dataloader, total=int(len(train_dataloader)))):
        images = images.to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Loss and accuracy
        train_loss += loss.item()
        y_actual = labels.data.cpu().numpy()
        y_pred = outputs[:, -1].detach().cpu().numpy()
        train_auc.append(roc_auc_score(y_actual, y_pred))

    # Evaluate the model
    model.eval()
    for i, (images, labels) in enumerate(tqdm(val_dataloader, total=int(len(val_dataloader)))):
        images = images.to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Loss and AUC
        val_loss += loss.item()
        y_actual = labels.data.cpu().numpy()
        y_pred = outputs[:, -1].detach().cpu().numpy()
        val_auc.append(roc_auc_score(y_actual, y_pred))

    # Average losses and accuracies
    train_loss = train_loss / len(train_dataloader)
    val_loss = val_loss / len(val_dataloader)
    train_losses.append(train_loss)
    val_losses.append(val_loss)
    training_auc = np.mean(train_auc)
    validation_auc = np.mean(val_auc)
    train_auc_epoch.append(training_auc)
    val_auc_epoch.append(validation_auc)

    # Updating best validation AUC
    if best_auc < validation_auc:
        best_auc = validation_auc

    # Saving best model
    if min_loss >= val_loss:
        torch.save(model.state_dict(), 'best_model.pt')
        min_loss = val_loss

    print('EPOCH {}/{}'.format(e+1, num_epochs))
    print('-' * 10)
    print("Train loss: {:.6f}, Train AUC: {:.4f}".format(train_loss, training_auc))
    print("Validation loss: {:.6f}, Validation AUC: {:.4f}\n".format(val_loss, validation_auc))

time_elapsed = time.time() - since
print('Training completed in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
print('Best validation AUC: {:.4f}'.format(best_auc))

100% 2048/2048 [19:19<00:00, 1.77it/s]
  6% 16/256 [00:08<01:52, 2.13it/s]

ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
     52     y_actual = labels.data.cpu().numpy()
     53     y_pred = outputs[:,-1].detach().cpu().numpy()
---> 54     val_auc.append(roc_auc_score(y_actual, y_pred))
     55
     56

2 frames
/usr/local/lib/python3.7/dist-packages/sklearn/metrics/_ranking.py in _binary_roc_auc_score(y_true, y_score, sample_weight, max_fpr)
    336     if len(np.unique(y_true)) != 2:
    337         raise ValueError(
--> 338             "Only one class present in y_true. ROC AUC score "
    339             "is not defined in that case."
    340         )

ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.

Hi,
Can someone help me to get rid of this error:
RuntimeError: mat1 and mat2 shapes cannot be multiplied (96x4096 and 12288x200)

class Net(nn.Module):
    def __init__(self, ):
        super(Net, self).__init__()
        # two hidden layers with 200 and 50 units
        # input size is 64 * 64
        self.fc1 = nn.Linear( 3 * IMAGE_SIZE * IMAGE_SIZE, 200)
        self.fc2 = nn.Linear(200, 50)
        self.fc3 = nn.Linear(50, 3) # output with NUM_CLASSES classes 
        
    def forward(self, x):
        # using relu activation function for each layer except output layer because for output we are using cross entropy loss
        print('x.shape', x.shape)
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        # no activation is needed at the end
        return x

@ptrblck, can you please suggest?

Based on your code it seems your input has a shape of [batch_size, 64*64=4096]. In this case the in_features of the first linear layer is expected to be 4096 as well, while it is currently set to a different value, which raises the error.
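
A minimal sketch of that fix, assuming the inputs really arrive as [batch_size, 4096] (e.g. single-channel 64x64 images that are flattened before the first linear layer):

import torch
import torch.nn as nn

IMAGE_SIZE = 64

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # in_features must match the flattened input size (64 * 64 = 4096 here)
        self.fc1 = nn.Linear(IMAGE_SIZE * IMAGE_SIZE, 200)
        self.fc2 = nn.Linear(200, 50)
        self.fc3 = nn.Linear(50, 3)

    def forward(self, x):
        x = x.view(x.size(0), -1)        # flatten while keeping the batch dimension
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)

net = Net()
out = net(torch.randn(96, IMAGE_SIZE * IMAGE_SIZE))  # matches the 96x4096 input from the error
print(out.shape)  # torch.Size([96, 3])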

Can anyone help me, please?

I’m having this issue with my chatbot:

You can see the code here:
https://github.com/GusdPaula/chatbot_pytorch

The error points towards a shape mismatch in a matmul operation (so I guess it’s an nn.Linear layer). Make sure the feature dimension of the input to the linear layer matches the specified in_features. Currently your input tensor seems to have a shape of [8, 8] while the linear layer expects 53 input features.
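
As a quick sanity check (a hypothetical sketch, not the chatbot's actual code), the last dimension of the tensor entering the linear layer has to equal in_features:

import torch
import torch.nn as nn

x = torch.randn(8, 53)        # hypothetical input batch: 53 features per sample
layer = nn.Linear(in_features=53, out_features=128)   # 128 is an arbitrary example size
print(layer(x).shape)         # torch.Size([8, 128])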

Thank you a lot. It worked!! :slight_smile:

Hello, hope you are well!
Can someone help me to solve this error:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x6 and 256x2)

I am trying to create an ensemble model classification for 2 classes - my models are vgg19, resnet50 and densenet161. My input image is resized to 224x224.

class EnsembleModel(nn.Module):   
    def __init__(self, modelA, modelB, modelC):
        super().__init__()
        self.modelA = modelA
        self.modelB = modelB
        self.modelC = modelC
        self.classifier = nn.Linear(256, 2)
        
        
    def forward(self, x):
        x1 = self.modelA(x)
        x2 = self.modelB(x)
        x3 = self.modelC(x)
        x = torch.cat((x1, x2, x3), dim=1)
        out = self.classifier(x)
        return out
    
ensemble_model = EnsembleModel(model_densenet161, model_resnet50, model_vgg19_bn)

for param in ensemble_model.parameters():
    param.requires_grad = False

for param in ensemble_model.classifier.parameters():
    param.requires_grad = True    

ensemble_model = ensemble_model.to(DEVICE)

The input activation x to the self.classifier seems to contain 6 features based on the error message while 256 are expected.
You could change the in_features argument of self.classifier to 6 and it should work.
However, based on your description it seems you are trying to use the final logits of all models to create an ensemble so you might also want to consider using the penultimate feature output.
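
A minimal sketch of the quick fix, assuming each of the three sub-models outputs 2 logits, so the concatenated tensor has 3 * 2 = 6 features:

import torch
import torch.nn as nn

# each sub-model returns [batch_size, 2] logits
x1, x2, x3 = torch.randn(64, 2), torch.randn(64, 2), torch.randn(64, 2)
x = torch.cat((x1, x2, x3), dim=1)   # [64, 6]

classifier = nn.Linear(6, 2)         # in_features matches the 6 concatenated logits
print(classifier(x).shape)           # torch.Size([64, 2])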


Yes! Thank you! You are right, I ran the ensemble and it worked, but it gives me way too low validation accuracy values (20-30%, while the single models are around 80-85%). Considering the penultimate feature output, do you mean something like this, changing the classifier?

self.classifier = nn.Sequential(
            nn.Linear(512, 6), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(6, 2), nn.LogSoftmax(dim=1))

Hi,

I am also getting the same error… it works fine when batch_size=1 in the train_loader, but with a batch size of 10 I get this:

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3, bias=False) 
        self.conv2 = nn.Conv2d(in_channels=10, out_channels=10, kernel_size=3, bias=False)
        self.fc1 = nn.Linear(in_features=10 * 24 * 24, out_features=10, bias=False)
        self.fc2 = nn.Linear(in_features=10, out_features=10, bias=False)
        self.out = nn.Linear(in_features=10, out_features=10, bias=False)
    
    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = x.reshape(1,-1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
        x = F.relu(x)
        x = self.out(x)
        x = F.softmax(x, dim=1)
        return x

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x57600 and 5760x10)

Since I have set bias=False as well, the calculations here should be straightforward. Still, I don't understand why the line x = self.fc1(x) raises the error.

Could you please explain the flow? What exactly am I missing?

train_set = torchvision.datasets.FashionMNIST(
    root="./data",
    train=True,
    download=True,
    transform=transforms.Compose([
        transforms.ToTensor()
    ])
)


train_loader = torch.utils.data.DataLoader(
    train_set,
    batch_size=10,
    shuffle=True
)

batch = next(iter(train_loader))
images, labels = batch

preds = net(images)

Thanks

Model ensembles often use the features of pretrained models instead of the final classifier output, so yes you could check the classifier and drop the final output layer(s).
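
For example, a rough sketch of that idea (not tested against your exact checkpoints; the feature sizes below are the standard torchvision ones: 2208 for densenet161, 2048 for resnet50, and 4096 for the penultimate layer of vgg19_bn's classifier):

import torch
import torch.nn as nn
import torchvision.models as models

model_densenet161 = models.densenet161(pretrained=True)
model_resnet50 = models.resnet50(pretrained=True)
model_vgg19_bn = models.vgg19_bn(pretrained=True)

# drop the final classification layers so each model returns its features
model_densenet161.classifier = nn.Identity()   # -> [batch, 2208]
model_resnet50.fc = nn.Identity()              # -> [batch, 2048]
model_vgg19_bn.classifier[-1] = nn.Identity()  # -> [batch, 4096]

# the ensemble classifier now takes the concatenated features
classifier = nn.Linear(2208 + 2048 + 4096, 2)

x = torch.randn(2, 3, 224, 224)
features = torch.cat((model_densenet161(x), model_resnet50(x), model_vgg19_bn(x)), dim=1)
print(classifier(features).shape)  # torch.Size([2, 2])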

@Manu-Chauhan
This line of code is wrong:

x = x.reshape(1,-1)

as it hard-codes the batch size to 1.
Use x = x.view(x.size(0), -1) and it should work.
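
With that change the forward pass would look like this (same structure as in your post, just slightly condensed, with only the reshape line changed):

def forward(self, x):
    x = F.relu(self.conv1(x))
    x = F.relu(self.conv2(x))
    x = x.view(x.size(0), -1)   # [batch_size, 10 * 24 * 24] instead of [1, batch_size * 5760]
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    x = self.out(x)
    return F.softmax(x, dim=1)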


I’m running a DQN class for my classification model. Below is a snapshot of my layers. I got RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x31 and 30x16). Can anyone help with it?

class DQN(nn.Module):

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(30, 16)
        self.fc2 = nn.Linear(16, 18)
        self.fc3 = nn.Linear(18, 20)
        self.fc4 = nn.Linear(20, 24)
        self.fc5 = nn.Linear(24, 2)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.dropout(x, p=0.25)
        x = F.relu(self.fc3(x))
        x = F.relu(self.fc4(x))
        x = torch.sigmoid(self.fc5(x))
        return x