ValueError: Expected input batch_size (1) to match target batch_size (4). How can I fix this?

I get ValueError: Expected input batch_size (1) to match target batch_size (4). in my training code at the line loss = criterion(outputs, labels), where I calculate the loss. I use the CIFAR10 dataset and load the data with batch_size=4, so the shapes are:
[Data Shape]
input shape: torch.Size([4, 3, 32, 32])
label shape: torch.Size([4])
but my model output has batch_size 1 and its shape is:
[Output Shape of model]
output shape: torch.Size([1, 10])
I think the output shape should be torch.Size([4, 10]) to fix this error, but I don't know what is wrong with my model. Which part is wrong, and how can I fix it? Please let me know.

Here is my code

import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F
print (torch.__version__)

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()

        # Layer#1
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3)
        self.conv2 = nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3)

        # Layer#2
        self.conv3 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3)

        # Layer#3
        self.conv4 = nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3)

        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.dropout = nn.Dropout(0.5)

        # Layer#4
        self.fc1 = nn.Linear(4*4*256, 512)
        # Layer#5
        self.fc2 = nn.Linear(512, 512)
        # Layer#6
        self.fc3 = nn.Linear(512, 10)



    def forward(self, x):

        # Layer#1
        x = self.conv1(x)
        x = F.relu(x)
        x = self.dropout(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = self.dropout(x)
        x = self.pool(x)

        # Layer#2
        x = self.conv3(x)
        x = F.relu(x)
        x = self.dropout(x)
        x = self.pool(x)

        # Layer#3
        x = self.conv4(x)
        x = F.relu(x)
        x = self.dropout(x)
        x = self.pool(x)

        # Layer#4
        x = x.view(-1, 4*4*256)
        x = self.fc1(x)
        x = F.relu(x)

        # Layer#5
        x = self.fc2(x)
        x = F.relu(x)

        # Layer#6
        x = self.fc3(x)
        x = F.softmax(x)

        return x

net = Net()

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
for epoch in range(10):  

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs
        inputs, labels = data

        #Check
        print(f"input shape: {inputs.shape}")
        print(f"label shape: {labels.shape}")

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        print(f"output shape: {outputs.shape}")
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0
            break

print('Finished Training')

I think the x = x.view(-1, 4*4*256) part has some problem, but I don't know how to fix it.

This is the model that I want to make.

x = x.view(-1, 4*4*256) flattens the output and is most likely what changes the batch size. Trace the spatial size of a 32x32 CIFAR10 image through the network: conv1 -> 30, conv2 -> 28, pool -> 14, conv3 -> 12, pool -> 6, conv4 -> 4, pool -> 2. The tensor entering the flatten step is therefore [4, 256, 2, 2], i.e. 2*2*256 = 1024 values per sample rather than 4*4*256 = 4096, so view(-1, 4096) packs the whole batch of 4 samples into a single row of shape [1, 4096]. That is why the output batch size becomes 1. Use x = x.view(x.size(0), -1) to keep the batch dimension, and change in_features of the next linear layer (fc1) to the actual flattened size, 2*2*256. Also, remove F.softmax: nn.CrossEntropyLoss expects raw logits (it applies log_softmax internally), so applying softmax yourself hurts training.
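
Here is a minimal sketch of the corrected model, assuming the data loading and training loop stay as in your code; the dummy-input check at the end only verifies the shapes.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3)
        self.conv2 = nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3)
        self.conv3 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3)
        self.conv4 = nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.dropout = nn.Dropout(0.5)
        # 2*2*256 = 1024 features reach fc1 for a 32x32 input, not 4*4*256
        self.fc1 = nn.Linear(2*2*256, 512)
        self.fc2 = nn.Linear(512, 512)
        self.fc3 = nn.Linear(512, 10)

    def forward(self, x):
        x = self.dropout(F.relu(self.conv1(x)))             # [batch, 64, 30, 30]
        x = self.pool(self.dropout(F.relu(self.conv2(x))))  # [batch, 64, 14, 14]
        x = self.pool(self.dropout(F.relu(self.conv3(x))))  # [batch, 128, 6, 6]
        x = self.pool(self.dropout(F.relu(self.conv4(x))))  # [batch, 256, 2, 2]
        x = x.view(x.size(0), -1)                           # [batch, 1024], batch dim preserved
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)   # raw logits; no softmax before nn.CrossEntropyLoss

# quick shape check with a dummy CIFAR10-sized batch
net = Net()
print(net(torch.randn(4, 3, 32, 32)).shape)  # torch.Size([4, 10])

With the batch dimension preserved, the criterion sees outputs of shape [4, 10] against labels of shape [4] and the error goes away.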