ValueError: Expected input batch_size (324) to match target batch_size (4)

Loop Train size torch.Size([64, 1, 28, 28])
Loop label Size torch.Size([64])
CNN_1 : torch.Size([64, 32, 28, 28])
MAXPOOL_1 : torch.Size([64, 32, 14, 14])
CNN_2 : torch.Size([64, 64, 14, 14])
MAXPOOL : torch.Size([64, 64, 7, 7])
FC: torch.Size([64, 3136])
FINAL : torch.Size([64, 10])

The code runs fine for the first 500 iterations:
Iteration :500 Loss :2.2414138317108154 Accuracy :10
but then raises the above-mentioned error again.

The above error is now sorted: the target given for training was wrong.
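For reference, assuming nn.CrossEntropyLoss and the shapes printed above, a model output of [64, 10] has to be paired with a target of shape [64] holding class indices. A minimal shape sanity check (random tensors, names are placeholders):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

output = torch.randn(64, 10)             # model logits: [batch_size, nb_classes]
target = torch.randint(0, 10, (64,))     # class indices: [batch_size]

assert output.size(0) == target.size(0), "batch sizes must match"
loss = criterion(output, target)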

Hello, I am trying to modify the spatial transformer network code for input images of size (4, 256, 256).

Here is the code of the model :

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(4, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        # self.fc1 = nn.Linear(320, 50)
        self.fc1 = nn.Linear(20*61*61, 50) # hard coded, 74420 = np.prod(x.shape[1:]) from forward
        self.fc2 = nn.Linear(50, 10)

        # Spatial transformer localization-network
        self.localization = nn.Sequential(
            nn.Conv2d(4, 8, kernel_size=7),
            nn.MaxPool2d(2, stride=2),
            nn.ReLU(True),
            nn.Conv2d(8, 10, kernel_size=5),
            nn.MaxPool2d(2, stride=2),
            nn.ReLU(True)
        )

        # Regressor for the 3 * 2 affine matrix
        self.fc_loc = nn.Sequential(
            # nn.Linear(10 * 3 * 3, 32), # to be changed from the original 10*3*3
            nn.Linear(10 * 60 * 60, 32), # custom
            nn.ReLU(True),
            nn.Linear(32, 3 * 2)
        )

        # Initialize the weights/bias with identity transformation
        self.fc_loc[2].weight.data.zero_()
        self.fc_loc[2].bias.data.copy_(torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    # Spatial transformer network forward function
    def stn(self, x):
        xs = self.localization(x)
        print("End of localization (in STN):", xs.shape)
        # xs = xs.view(-1, 10 * 3 * 3) # to be changed for a different image size
        # (2, 90)
        xs = xs.view(-1, 10 * 60 * 60) # custom 
        theta = self.fc_loc(xs)
        print('During STN xs :', xs.shape)
        theta = theta.view(-1, 2, 3) # shape : (2, 2, 3)
        print("During STN theta :", theta.shape,'\n')

        grid = F.affine_grid(theta, x.size())
        x = F.grid_sample(x, grid)

        return x

    def forward(self, x):
        # transform the input
        x = self.stn(x) # shape (2, 10, 3, 3)
        print('End STN :', x.shape)

        # Perform the usual forward pass
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        print('End conv1 :', x.shape)
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        print("End conv2_drop : ", x.shape)
        # x = x.view(-1, 320)
        # x = x.view(-1, np.prod(x.shape[1:])) # custom 
        x = x.view(x.size(0), -1)
        print(x.shape)
        x = F.relu(self.fc1(x))
        print('Fully connected 1 :', x.shape)
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        print('Fully connected 2 :', x.shape)
        return F.log_softmax(x, dim=1)

When I print the summary with torchsummary, the dimensions look fine, but I get the same error as everyone else. Could anyone help me debug? I can't find where it goes wrong.
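For reference, this is roughly how I computed the hard-coded flattened sizes in the comments above (a quick sketch using a dummy forward pass; the batch size of 2 is just a placeholder):

import numpy as np
import torch

net = Net()                              # the model defined above
dummy = torch.randn(2, 4, 256, 256)      # placeholder batch matching my input size

xs = net.localization(dummy)
print(xs.shape, np.prod(xs.shape[1:]))   # [2, 10, 60, 60] -> 36000 = 10 * 60 * 60 for fc_loc

The same trick on the tensor printed right before x.view in forward gives 20 * 61 * 61 = 74420 for fc1.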

Hello, I'm new to machine learning. I want to train a UNet model on my dataset, but I'm getting the above-mentioned error. Could you please help me fix it?
My loss function:
def cross_entropy2d(input, target, weight=None, size_average=True):
    target = target.squeeze(1)

    n, c, h, w = input.size()
    nt, ct, ht, wt = target.size()

    # Handle inconsistent size between input and target
    if h != ht and w != wt:  # upsample labels
        input = F.interpolate(input, size=(ht, wt), mode="bilinear", align_corners=True)

    input = input.transpose(1, 2).transpose(2, 3).contiguous().view(-1, c)
    target = target.view(-1)
    loss = F.cross_entropy(
        input, target, weight=weight, size_average=size_average, ignore_index=250
    )
    return loss

My UNet model:

import torch.nn as nn

from ptsemseg.models.utils import unetConv2, unetUp

class unet(nn.Module):
    def __init__(
        self, feature_scale=4, n_classes=3, is_deconv=True, in_channels=3, is_batchnorm=True
    ):
        super(unet, self).__init__()
        self.is_deconv = is_deconv
        self.in_channels = in_channels
        self.is_batchnorm = is_batchnorm
        self.feature_scale = feature_scale

        filters = [64, 128, 256, 512, 1024]
        filters = [int(x / self.feature_scale) for x in filters]

        # downsampling
        self.conv1 = unetConv2(self.in_channels, filters[0], self.is_batchnorm)
        self.maxpool1 = nn.MaxPool2d(kernel_size=2)

        self.conv2 = unetConv2(filters[0], filters[1], self.is_batchnorm)
        self.maxpool2 = nn.MaxPool2d(kernel_size=2)

        self.conv3 = unetConv2(filters[1], filters[2], self.is_batchnorm)
        self.maxpool3 = nn.MaxPool2d(kernel_size=2)

        self.conv4 = unetConv2(filters[2], filters[3], self.is_batchnorm)
        self.maxpool4 = nn.MaxPool2d(kernel_size=2)

        self.center = unetConv2(filters[3], filters[4], self.is_batchnorm)

        # upsampling
        self.up_concat4 = unetUp(filters[4], filters[3], self.is_deconv)
        self.up_concat3 = unetUp(filters[3], filters[2], self.is_deconv)
        self.up_concat2 = unetUp(filters[2], filters[1], self.is_deconv)
        self.up_concat1 = unetUp(filters[1], filters[0], self.is_deconv)

        # final conv (without any concat)
        self.final = nn.Conv2d(filters[0], n_classes, 1)

    def forward(self, inputs):
        conv1 = self.conv1(inputs)
        maxpool1 = self.maxpool1(conv1)

        conv2 = self.conv2(maxpool1)
        maxpool2 = self.maxpool2(conv2)

        conv3 = self.conv3(maxpool2)
        maxpool3 = self.maxpool3(conv3)

        conv4 = self.conv4(maxpool3)
        maxpool4 = self.maxpool4(conv4)

        center = self.center(maxpool4)
        up4 = self.up_concat4(conv4, center)
        up3 = self.up_concat3(conv3, up4)
        up2 = self.up_concat2(conv2, up3)
        up1 = self.up_concat1(conv1, up2)

        final = self.final(up1)

        return final

When I print input.size() in the loss function, I get the result shown in the attached screenshot (image not reproduced here).

Please help me.

Based on your code, it looks like you are working on a pixel-wise classification (e.g. segmentation use case).
Also, it looks like you would like to push all pixels into the batch dimension and just flatten the target.
This should generally work with nn.CrossEntropyLoss, although the reshaping is not necessary, as you can work with spatial outputs and targets.

However, the target shape looks wrong and it seems you might be using a one-hot encoded target.
For a segmentation use case with nn.CrossEntropyLoss, the model output should have the shape [batch_size, nb_classes, height, width], while the target should be a LongTensor with the shape [batch_size, height, width] containing the class indices in the range [0, nb_classes-1].
If the one-hot encoding is in dim 3 of your target tensor, you could simply call:

target = torch.argmax(target, 3)

to get the class indices.

On the other hand, if you are dealing with a multi-label classification, i.e. each pixel can belong to more than a single class, you should permute the target to have the same shape as your model output, and use nn.BCEWithLogitsLoss instead.
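To make the expected shapes concrete, here is a minimal sketch with random tensors (the shape values are placeholders, not taken from your post):

import torch
import torch.nn as nn
import torch.nn.functional as F

batch_size, nb_classes, H, W = 2, 3, 4, 4
output = torch.randn(batch_size, nb_classes, H, W)    # model logits: [N, C, H, W]

# one-hot target with the class dimension in dim 3: [N, H, W, C]
target_onehot = F.one_hot(
    torch.randint(0, nb_classes, (batch_size, H, W)), nb_classes
)

# segmentation: convert to class indices of shape [N, H, W]
target = torch.argmax(target_onehot, 3)
loss = nn.CrossEntropyLoss()(output, target)

# multi-label per pixel: permute the target to the output shape [N, C, H, W]
multi_label_target = target_onehot.permute(0, 3, 1, 2).float()
loss_ml = nn.BCEWithLogitsLoss()(output, multi_label_target)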


Thanks. It really helps.

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
import torch
import os
import cv2
import matplotlib.pyplot as plt
execution_path=os.getcwd()
filenames = ["no", "yes"]
training_data = []
data_set = []
data_label = []
dirname = '/kaggle/input/brain-mri-images-for-brain-tumor-detection/brain_tumor_dataset/'

for filename in filenames:
    # print(os.path.join(dirname, filename))
    path = os.path.join(dirname, filename)
    filename_index = filenames.index(filename)
    for image in os.listdir(path):
        try:
            img_array = cv2.imread(os.path.join(path, image), cv2.IMREAD_GRAYSCALE)
            img_array = img_array.astype(np.float32)
            img_array = cv2.resize(img_array, (64, 64))
            # img = Image.open(self.paths[index]).convert('RGB')
            training_data.append([img_array, filename_index])
        except Exception as e:
            print(e)

np.random.shuffle(training_data)
for feature, label in training_data:
    data_set.append(feature)
    data_label.append(label)

x_train, y_train, x_test, y_test = train_test_split(data_set, data_label, test_size=0.1,
                                                    random_state=45)
x_train = np.array(x_train).reshape(-1, 1, 64, 64)
y_train = np.array(y_train).reshape(-1, 1, 64, 64)
x_trainn = torch.from_numpy(x_train)
y_trainn = torch.from_numpy(y_train)

import torch.nn as nn
import torch

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.layer1 = nn.Sequential(nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=1), nn.ReLU(), nn.MaxPool2d(kernel_size=2, stride=2), nn.Dropout(0.5))
        self.layer2 = nn.Sequential(nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1), nn.ReLU(), nn.MaxPool2d(kernel_size=2, stride=2), nn.Dropout(0.5))
        self.layer3 = nn.Sequential(nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1), nn.ReLU(), nn.MaxPool2d(kernel_size=2, stride=2, padding=1), nn.Dropout(0.5))
        self.fc1 = nn.Linear(5184, 2, bias=True)

        # self.layer4 = nn.Sequential(self.fc1, nn.ReLU(), nn.Dropout(0.5))
        # self.fc2 = nn.Linear(630, 5, bias=True)

    def forward(self, X):
        out = self.layer1(X)
        out = self.layer2(out)
        out = self.layer3(out)
        out = out.view(out.size(0), -1)
        out = self.fc1(out)
        # out = self.fc2(out)
        return out

model=CNN()
print(model)

criterion=nn.CrossEntropyLoss()
optimizer=torch.optim.Adam(model.parameters(),lr=0.01)

y_pred=model(x_trainn)
loss=criterion(y_pred,y_trainn)
optimizer.zero_grad()
loss.backward()
optimizer.step()

P.S. Expected input batch_size (227) to match target batch_size (26).

Please tell me how to fix this error.

@ptrblck sir, please reply.

Hey there,
I have a similar issue and cannot find the solution:

# tested with python 3.7.5
from torch.utils.data import Dataset
import ast
import torch
import torch.nn as nn
from torch.autograd import Variable

# Links:
# https://www.kaggle.com/danieldagnino/training-a-classifier-with-pytorch

class my_model(nn.Module):
    def __init__(self,n_in=60, n_hidden=10, n_out=60):
        super(my_model,self).__init__()
        self.n_in  = n_in
        self.n_out = n_out

        self.linearlinear = nn.Sequential(
            nn.Linear(self.n_in,  self.n_out, bias=True),   # Hidden layer.
            )
        self.logprob = nn.LogSoftmax(dim=1)                 # -Log(Softmax probability).

    def forward(self,x):
        x = self.linearlinear(x)
        print("In forward:", x.shape)
        x = self.logprob(x)
        print("In forward:", x.shape)
        return x

class TESNamesDataset(Dataset):
    def __init__(self, data_root):
        self.samples = []
        with open(data_root,"r") as f:
            self.samples = [ [ast.literal_eval(ast.literal_eval(elem)[0]), ast.literal_eval(ast.literal_eval(elem)[1])] for elem in f.read().split('\n') if elem]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        # Function that returns one input and one output (label)
        return torch.Tensor(self.samples[idx][0]), torch.Tensor(self.samples[idx][1])

if __name__ == '__main__':
    from torch.utils.data import DataLoader
    my_data = TESNamesDataset('first_move.txt')
    my_loader = DataLoader(my_data, batch_size=1, num_workers=0)

    model = my_model()
    criterium = nn.NLLLoss()
    optimizer = torch.optim.Adam(model.parameters(),lr=0.1,weight_decay=1e-4)

    # Training.
    for k, (data, target) in enumerate(my_loader):
        print("\nEpoch:", k)
        print(data)
        print(target)
        #print(data.view(-1).shape) # 1x60
        data   = Variable(data,         requires_grad=False) # input
        target = Variable(target.long(), requires_grad=False) # output

        # Set gradient to 0.
        optimizer.zero_grad()
        # Feed forward.
        pred = model(data)

        print("Pred.shape:", pred.shape)
        print("target.shape:", target.shape)

        print(target.view(-1).shape)
        #print(eee)

        loss = criterium(pred, target.view(-1))

        # Gradient calculation.
        loss.backward()

        # Print loss every 10 iterations.
        if k%10==0:
            print('Loss {:.4f} at iter {:d}'.format(loss.item(),k))

        # Model weight modification based on the optimizer.
        optimizer.step()

Output:

Epoch: 0
tensor([[0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0.,
         0., 1., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0., 0., 0., 0., 0., 1., 0.,
         0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 1., 1., 1., 0., 0., 0., 1., 1.,
         1., 0., 1., 0., 1., 0.]])
tensor([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0.]])
In forward: torch.Size([1, 60])
In forward: torch.Size([1, 60])
Pred.shape: torch.Size([1, 60])
target.shape: torch.Size([1, 60])
torch.Size([60])

Any idea why I get this error:
ValueError: Expected input batch_size (1) to match target batch_size (60).

Solution:
I found a solution: my labels were binary vectors of size 60x1, which is also how I wanted them to be, but with that format I always got the above error. I therefore converted the binary vector to an integer class index, and then it worked. I am not sure, however, whether I still need the 60 outputs now, since I actually just need one number (the index of the set bit).

@ptrblck

Define Autoencoder

#dropout_p= 0.5
class Autoencoder(nn.Module):

    def __init__(self):
        super(Autoencoder, self).__init__()
        self.encoder = nn.Sequential(
            # nn.Dropout2d(dropout_p=0.5),
            nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=1, bias=False),
            nn.ReLU(True),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1),
            nn.ReLU(True),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
            nn.ReLU(True),
            nn.MaxPool2d(2, 2),
        )
        self.decoder = nn.Sequential(
            nn.MaxUnpool2d(2, 2),
            nn.ConvTranspose2d(64, 32, kernel_size=3),
            nn.ReLU(True),
            nn.MaxUnpool2d(2, 2),
            nn.ConvTranspose2d(32, 16, kernel_size=3),
            nn.ReLU(True),
            nn.MaxUnpool2d(2, 2),
            nn.ConvTranspose2d(16, 8, kernel_size=3)
            # nn.Tanh()
        )

    def forward(self, x):
        x = self.encoder(x)
        encoded_x = x
        x = self.decoder(x)
        print(x.shape)
        return x, encoded_x

num_epochs = 5
learning_rate = 1e-5
model = Autoencoder()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for epoch in range(num_epochs):
    for data in loader_train_set:
        img, label = data
        img = img.view(-1, 1, 320, 240)
        img = Variable(img)
        label = Variable(label)
        encoded_output = model.encoder(img)
        loss = criterion(encoded_output, label)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    print('epoch [{}/{}], loss:{:.4f}'.format(epoch + 1, num_epochs, loss.data()))

ValueError: Expected input batch_size (60) to match target batch_size (20).
I tried the solutions provided above, but they didn't work for my code. I need help.

Could you post the shapes of img, encoded_output, and label?

Also, Variables are deprecated since PyTorch 0.4.0, so you can just use tensors now. :wink:

@CesMak
nn.NLLLoss expects a target as a LongTensor containing the class indices in the range [0, nb_classes-1], while your target seems to be one-hot encoded.
You could try to create the expected target via target = torch.argmax(target, 1).
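Applied to the training loop above, that would look roughly like this (just a sketch; the variable names follow the posted code):

# target comes out of the DataLoader as a one-hot float tensor of shape [1, 60]
target = torch.argmax(target, 1)   # -> LongTensor of shape [1] holding the class index

pred = model(data)                 # log-probabilities of shape [1, 60]
loss = criterium(pred, target)     # nn.NLLLoss expects input [N, C] and target [N]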

Thank you so much!
The shapes of img, encoded_output, and label are as below:
Label = torch.Size([20])
img = torch.Size([60, 1, 320, 240])
encoded_output= torch.Size([60, 64, 40, 30])

Your encoded_output shape looks as if it contains logits for a multi-class segmentation, while the targets are for a multi-class classification with another batch size.

My best guess at the moment is that the call img = img.view(-1, 1, 320, 240) creates a new batch size.
What is the shape of img before the view op?

It is torch.Size([20, 3, 240, 320]).

Yes, you are right. The view operation is creating the new batch size (reshaping [20, 3, 240, 320] with view(-1, 1, 320, 240) keeps all elements but turns the batch dimension into 20 * 3 = 60). I used it because of a channel mismatch issue: I am using grayscale images, and before the view operation the size of img is [20, 3, 240, 320], although it should be [20, 1, 240, 320].

You can’t use view in this case.
If you are sure you are dealing with a grayscale image, you could either load the image in the grayscale format from the beginning or slice the tensor via img = img[:, 0], since all channels should contain the same value.

img, label = data
print(img.size())
img = torch.tensor(img)
img = img[:, 0]
encoded_output = model.encoder(img)
is giving me this error
RuntimeError: Expected 4-dimensional input for 4-dimensional weight 16 1 3 3, but got 3-dimensional input of size [20, 240, 320] instead

Sorry, my bad. Unsqueeze img at dim1 or use img = img[:, 0:1] :wink:
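In other words, either of these keeps the channel dimension intact (a small sketch of the two options, reusing the variable names from the snippet above):

img, label = data                  # img: [20, 3, 240, 320], all channels contain the same values
img = img[:, 0:1]                  # slice -> [20, 1, 240, 320]
# or equivalently:
# img = img[:, 0].unsqueeze(1)     # [20, 240, 320] -> [20, 1, 240, 320]

encoded_output = model.encoder(img)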

Apologies for throwing out so many questions, but I have been trying for quite a long time and I am just stuck.
Now I get the error below:
RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of dimension: 1