ValueError: Expected input batch_size (16) to match target batch_size (32)

Deb_Prakash_Chatterj · June 8, 2019, 5:34pm

I am getting an unexpected error in my code:

     88     prediction = torch.argmax(p, dim=1)
     89     #loss = torch.nn.functional.nll_loss(torch.log(p), y)
---> 90     loss = criterion(output, labels)
     91     loss.backward()
     92     optimizer.step()

ValueError: Expected input batch_size (16) to match target batch_size (32).

Here is my code -

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # convolutional layer
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, 5, padding=2)
        self.conv3 = nn.Conv2d(32, 64, 5, padding=2)
        self.conv4 = nn.Conv2d(64, 128, 5, padding=2)
      
        # max pooling layer
        self.pool = nn.MaxPool2d(2, 2)
      
        self.fc1 = nn.Linear(50176, 25088)
        self.fc2 = nn.Linear(25088, 12544)
        self.fc3 = nn.Linear(12544, 4)
        
        # linear layer (64 * 4 * 4 -> 500)
        #self.fc1 = nn.Linear(64 * 4 * 4, 500)
        # linear layer (500 -> 10)
        #self.fc2 = nn.Linear(500, 10)
        
        #Apply Dropout to reduce Overfitting
        self.dropout = nn.Dropout(0.4)

    def forward(self, x):
        # add sequence of convolutional and max pooling layers
        #x = x.view(x.shape[0], -1)
                
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))
        x = self.pool(F.relu(self.conv4(x)))
                
        #Flatten the Image
        x = x.view(-1, 224 * 224)
        
        # add dropout layer
        x = self.dropout(x)
        
        # add 1st hidden layer, with relu activation function
        x = F.relu(self.fc1(x))
        # add dropout layer
        x = self.dropout(x)
        # add 2nd hidden layer, with relu activation function
        x = F.relu(self.fc2(x))
        # add dropout layer
        x = self.dropout(x)
        # output layer (no activation: nn.CrossEntropyLoss expects raw logits)
        x = self.fc3(x)
       
        
        # add 2nd hidden layer, with relu activation function
        #x = self.fc2(x)
        return x

import torch.optim as optim

# specify loss function
criterion = nn.CrossEntropyLoss()

# specify optimizer
optimizer = optim.Adam(model.parameters(), lr=0.0001)

# number of epochs to train the model
epochs = 30 # you may increase this number to train a final model

valid_loss_min = np.inf # track change in validation loss

for epoch in range(1, epochs+1):
  
  running_loss = 0
  model.train()
  for images, labels in dataloader_train:
    
    #steps += 1
    images, labels = images.to(device), labels.to(device)
    
    
    optimizer.zero_grad()
    
    output = model.forward(images)
    conf_matrix = confusion_matrix(output, labels, conf_matrix)
    p = torch.nn.functional.softmax(output, dim=1)
    prediction = torch.argmax(p, dim=1)
    #loss = torch.nn.functional.nll_loss(torch.log(p), y)
    loss = criterion(output, labels)
    loss.backward()
    optimizer.step()
    
    running_loss += loss.item() * images.size(0)

I have tried several suggested solutions, but most of them did not work and I did not understand the others.
Can anyone please help me with this? Thanks in advance.

GokulDAS027 · June 8, 2019, 6:21pm

The problem might be with the dataloader.
The target (labels) and the data (input to the model) have different batch sizes.

Also, the flattening of the image in forward() may be doing something wrong, though I am not sure. Try tweaking it and report back.
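
A minimal sketch of how the flatten step alone could change the batch size. It assumes a batch of 32 RGB images of 224x224 (suggested by the error message and the 224 * 224 in the flatten line), so that the feature map after the four conv + pool stages has shape (32, 128, 14, 14). The tensors here are dummies, not the original data.

```python
import torch

# Assumed shape of the feature map after the four conv + pool stages for a
# batch of 32 images of 224x224: each pool halves H and W
# (224 -> 112 -> 56 -> 28 -> 14) and conv4 outputs 128 channels.
features = torch.zeros(32, 128, 14, 14)
labels = torch.zeros(32, dtype=torch.long)

# Flatten as in the posted forward(): the row count is inferred from -1,
# so it becomes 32 * 128 * 14 * 14 / (224 * 224) = 16.
wrong = features.view(-1, 224 * 224)
print(wrong.shape)  # torch.Size([16, 50176])

# Flatten while keeping the batch dimension fixed.
right = features.view(features.size(0), -1)
print(right.shape)  # torch.Size([32, 25088])

# A cheap check that can be run just before computing the loss.
batch_matches = right.size(0) == labels.size(0)
print(batch_matches)  # True
```

Under these assumptions the dataloader is not at fault: the model output ends up with 16 rows while the labels still have 32 entries, which matches the reported error.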

Deb_Prakash_Chatterj · June 8, 2019, 7:50pm

So, along with displaying the image, I also printed the shape of the images batch. It came out like this:

images, labels = next(iter(dataloader_train))
imshow_numpy(images[0].numpy())
print(images.shape)

torch.Size([32, 3, 224, 224])  # I think this is (not sure about the first one, color channels, H, W)

In the forward function, I am using this -

#Flatten the Image
x = x.view(-1, 224 * 224)

How can I check whether the targets and the images have different batch sizes?
Also, I am concerned that the input size of my first linear layer is not right:

self.fc1 = nn.Linear(50176, 25088)
self.fc2 = nn.Linear(25088, 12544)
self.fc3 = nn.Linear(12544, 4)

How can I verify that? Is my concern even valid? Thanks.
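
One way to check the input size of the first linear layer is to work out the feature map size by hand. The sketch below is plain arithmetic, not part of the original code: the helper names conv_out and pool_out are made up for illustration, and it assumes 224x224 inputs and the default dilation of 1.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution (dilation assumed to be 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial output size of a max pooling layer without padding."""
    return (size - kernel) // stride + 1


# (kernel, padding) for conv1..conv4 as defined in Net.__init__;
# each conv is followed by MaxPool2d(2, 2) in forward().
size = 224
for kernel, padding in [(3, 1), (5, 2), (5, 2), (5, 2)]:
    size = pool_out(conv_out(size, kernel, padding))

channels = 128  # out_channels of conv4
in_features = channels * size * size
print(size, in_features)  # 14 25088
```

Under these assumptions each image flattens to 128 * 14 * 14 = 25088 values, not 224 * 224 = 50176, so the concern looks valid: fc1 would need in_features=25088, with the flatten written as x.view(x.size(0), -1) so the batch dimension stays at 32.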