Batch size match

nfreitas · August 30, 2023, 7:15pm

I’m trying to create a CNN to do image recognition, but my training loop is failing at the loss computation with the error “Expected input batch_size (1) to match target batch_size (50)”. I have seen many examples like mine on the internet and looked for a solution here and on other pages, but couldn’t figure out how to solve it.
As far as I understand, the output of the model has batch size 1 while the target has batch size 50, as I defined in the DataLoader. Here is the code:

train_dl = torch.utils.data.DataLoader(dataset = train_ds,
                                       batch_size = 50,
                                       shuffle = True)

test_dl = torch.utils.data.DataLoader(dataset = test_ds,
                                      batch_size = 50,
                                      shuffle = False)

class ConvNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(50, 5, 3, bias=False)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(5, 1, 3, bias=False)
        self.fc1 = nn.Linear(25, 10)
        
    def forward(self, x):
        #N 50 28 28
        x = F.relu(self.conv1(x)) # N 5 26 26
        x = self.pool(x)          # N 5 13 13
        x = F.relu(self.conv2(x)) # N 1 11 11
        x = self.pool(x)          # N 1 5 5
        x = torch.flatten(x, 1)   # N 25
        x = self.fc1(x)           # N 10
        
        return x

CNN = ConvNet()

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(CNN.parameters(), lr=0.001)

n_total_steps = len(train_dl)

for epoch in range(100):
    
    running_loss = 0.0
    
    for i, (images,labels) in enumerate(train_dl):
        images = images.float()
        labels = labels

        outputs = CNN(images)
        loss = criterion(outputs, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    print(f'[{epoch + 1}] loss: {running_loss / n_total_steps:.3f}')

print('Finished Training')

Thanks a lot in advance!!

ptrblck · August 30, 2023, 7:56pm

Your code works fine using:

train_ds = TensorDataset(torch.randn(101, 50, 28, 28), torch.randint(0, 10, (101,)))

train_dl = torch.utils.data.DataLoader(dataset = train_ds,
                                       batch_size = 50,
                                      shuffle = True)

which creates a last batch with a single sample.
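A minimal, self-contained sketch of the check described above, assuming the ConvNet from the first post is used unchanged and the data is random stand-in data rather than the real dataset. It collects the output and label shapes per batch to show that the loss computes without a mismatch, including for the final batch of one sample.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset


class ConvNet(nn.Module):
    # Same layers as the model in the first post.
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(50, 5, 3, bias=False)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(5, 1, 3, bias=False)
        self.fc1 = nn.Linear(25, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        return self.fc1(torch.flatten(x, 1))


# Random stand-in data: 101 samples, each with 50 channels of 28x28.
train_ds = TensorDataset(torch.randn(101, 50, 28, 28),
                         torch.randint(0, 10, (101,)))
train_dl = DataLoader(train_ds, batch_size=50, shuffle=True)

model = ConvNet()
criterion = nn.CrossEntropyLoss()

batch_shapes = []
for images, labels in train_dl:
    outputs = model(images)
    loss = criterion(outputs, labels)  # output and target batch sizes agree
    batch_shapes.append((tuple(outputs.shape), tuple(labels.shape)))

print(batch_shapes)
```

Since the model runs on 4-D input of shape [N, 50, 28, 28], the mismatch in the original post most likely comes from the shape of the data fed to the model, not from the model or the loop.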

nfreitas · August 31, 2023, 7:27am

Thanks a lot! So the issue is in my Dataset!
I’m using a dataset in CSV format and converting it to a tensor, so I suppose the issue is in my Dataset class.

This is what I’m doing:

data = pd.read_csv('train.csv')

X1 = data.drop(['label'], axis=1)
y = data[['label']]

X1_train, X1_test, y1_train, y1_test = train_test_split(X1, y, test_size=0.25, stratify=y, random_state=108)

y1_train.reset_index(inplace=True)
y1_train = y1_train['label']

class CustDF(torch.utils.data.Dataset):
    def __init__(self, features, target):
        self.features = features
        self.target = target
    
    def __len__(self):
        return self.features.shape[0]
    
    def __getitem__(self, idx):
        dp = self.features.iloc[idx].values
        targ = self.target[idx]
        dp = torch.from_numpy(dp)
        dp = dp.view(28,28)
        targ = torch.tensor(targ)
        return dp, targ

train_ds = CustDF(X1_train, y1_train)
test_ds = CustDF(X1_test, y1_test)

I still don’t see how to transform it so that it works as in your example. I’m using the Kaggle Digit Recognizer dataset.

Thanks a lot ptrblck!

ptrblck · August 31, 2023, 1:48pm

Yes, the issue might come from your custom CustDF dataset. Print the shapes of the data and target inside the __getitem__ before returning them to see what’s happening and where the shape mismatch comes from.
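One way to do this shape check, sketched with a hypothetical stand-in dataset (`FakeDigits` is not from the thread; it only mimics the shapes that the posted CustDF returns, using random values). It prints the shapes of a single sample and of one collated batch, since the batch is what the model actually receives.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class FakeDigits(Dataset):
    """Hypothetical stand-in that returns the same shapes as the posted CustDF."""

    def __init__(self, n_samples=100):
        self.features = torch.randint(0, 256, (n_samples, 784))  # fake pixels
        self.target = torch.randint(0, 10, (n_samples,))         # fake labels

    def __len__(self):
        return self.features.shape[0]

    def __getitem__(self, idx):
        dp = self.features[idx].view(28, 28)  # same reshape as in the post
        targ = self.target[idx]
        return dp, targ


ds = FakeDigits()

# Shapes of one sample, as returned by __getitem__.
dp, targ = ds[0]
print("sample:", dp.shape, targ.shape)

# Shapes of one batch, as produced by the DataLoader's default collation.
images, labels = next(iter(DataLoader(ds, batch_size=50)))
print("batch: ", images.shape, labels.shape)
```

Comparing the printed batch shape against the `N 50 28 28` comment in `forward` shows whether the model is being fed the layout it was written for.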

nfreitas · September 1, 2023, 6:58am

ptrblck · September 1, 2023, 3:54pm

Did you also check the shape of the target as suggested?

nfreitas · September 5, 2023, 7:22pm

Hi @ptrblck!
I checked but I found no errors!

I’m still trying to figure out what I’m doing wrong!
The dataset is composed of 784 features and one target. The 784 features are the pixels of 28x28 images.

I reshaped the tensor using .view so it is 28x28 as expected instead of 1x784.

ptrblck · September 5, 2023, 8:47pm

Your screenshot still doesn’t show the shape of the targ tensor, but just of dp before and after the view operation.

nfreitas · September 5, 2023, 9:11pm

Like this, you mean?

I was sure the issue was in dp, but now I see this strange result, as if the tensor were empty. But it’s not. I can’t add another screenshot to this post, but this tensor exists and is not empty.

nfreitas · September 5, 2023, 9:20pm

This tensor(8) is the one that returns torch.Size([]). I have to sleep now; I’ll check it later. Thanks a lot.
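For reference, a short sketch of what `torch.Size([])` means. It is the shape of a zero-dimensional (scalar) tensor, which holds exactly one value, and differs from an empty tensor. Stacking such scalars, which is roughly what the DataLoader does with the targets, gives a 1-D tensor with one entry per sample. The values below are made up for illustration.

```python
import torch

targ = torch.tensor(8)
print(targ.shape)    # torch.Size([]): zero dimensions, not zero elements
print(targ.numel())  # 1
print(targ.item())   # 8

# Stacking 50 scalar targets gives a 1-D tensor of length 50.
labels = torch.stack([torch.tensor(i % 10) for i in range(50)])
print(labels.shape)  # torch.Size([50])

# A genuinely empty tensor looks different.
empty = torch.tensor([])
print(empty.shape, empty.numel())  # torch.Size([0]) 0
```

So a scalar target per sample is the expected layout for `nn.CrossEntropyLoss` with class-index targets, and the target side of the pipeline looks fine here.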

nfreitas · September 9, 2023, 2:50pm

Issue solved, my dear @ptrblck!
The problem was in the CustDF class, as expected.

The CustDF that makes it work is this:

class CustDF(torch.utils.data.Dataset):
    def __init__(self, features, target):
        self.features = features
        self.target = target
    
    def __len__(self):
        return self.features.shape[0]
    
    def __getitem__(self, idx):
        dp = self.features.iloc[idx].values
        targ = self.target[idx]
        dp = torch.from_numpy(dp)
        dp = dp.view(1,28,28)
        targ = torch.tensor(targ)
        return dp, targ

The view must be (1,28,28) instead of just (28,28). This is because conv2d expects [50, 1, 28, 28] in this case (50 = batch size, 1 = number of channels, which here is 1 for a B&W image, and 28x28 is the image size).

Thanks a lot for your help and patience. It’s amazing for a newbie to get help from this community.
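A sketch of why the original shapes produced a batch size of 1, using random stand-in data. With (28,28) samples the batch is [50, 28, 28], and recent PyTorch versions treat a 3-D input to Conv2d as a single unbatched image of shape (C, H, W), so the 50 samples are read as 50 channels. One assumption that the thread does not show: for the [50, 1, 28, 28] layout to work, the first layer presumably also has to take 1 input channel, i.e. `nn.Conv2d(1, 5, 3)` instead of `nn.Conv2d(50, 5, 3)`. The `in_channels` argument below is added only for this comparison.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvNet(nn.Module):
    # Same layers as in the thread; in_channels is parameterized here
    # only to compare the two setups (an assumption, not the posted code).
    def __init__(self, in_channels):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 5, 3, bias=False)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(5, 1, 3, bias=False)
        self.fc1 = nn.Linear(25, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        return self.fc1(torch.flatten(x, 1))


labels = torch.randint(0, 10, (50,))

# Original setup: samples of shape (28, 28) collate to [50, 28, 28].
# Conv2d(50, ...) reads this as one image with 50 channels.
bad_batch = torch.randn(50, 28, 28)
bad_out = ConvNet(in_channels=50)(bad_batch)
print(bad_out.shape)  # one row of logits against 50 targets

# Fixed setup: samples of shape (1, 28, 28) collate to [50, 1, 28, 28].
good_batch = torch.randn(50, 1, 28, 28)
good_out = ConvNet(in_channels=1)(good_batch)
print(good_out.shape)  # one row of logits per sample

loss = nn.CrossEntropyLoss()(good_out, labels)
print(loss.item())
```

In the first case the model returns a single row of logits while the target still has 50 entries, which matches the error message from the first post.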