ValueError: Expected input batch_size (324) to match target batch_size (4)

Abhishek_Kumar2 · September 23, 2019, 8:20am

Loop Train size torch.Size([64, 1, 28, 28])
Loop label Size torch.Size([64])
CNN_1 : torch.Size([64, 32, 28, 28])
MAXPOOL_1 : torch.Size([64, 32, 14, 14])
CNN_2 : torch.Size([64, 64, 14, 14])
MAXPOOL : torch.Size([64, 64, 7, 7])
FC: torch.Size([64, 3136])
FINAL : torch.Size([64, 10])

The code runs fine for first 500 iterations:
Iteration :500 Loss :2.2414138317108154 Accuracy :10
But again given the above mentioned error

Abhishek_Kumar2 · September 23, 2019, 9:52am

The above error is sorted as the target given for trainning was wrong.

cthnguyen · October 3, 2019, 3:12pm

Hello, I am trying to modify the spatial transformer network code for input images of size (4, 256, 256).

Here is the code of the model :

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(4, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        # self.fc1 = nn.Linear(320, 50)
        self.fc1 = nn.Linear(20*61*61, 50) # hard coded, 74420 = np.prod(x.shape[1:]) from forward
        self.fc2 = nn.Linear(50, 10)

        # Spatial transformer localization-network
        self.localization = nn.Sequential(
            nn.Conv2d(4, 8, kernel_size=7),
            nn.MaxPool2d(2, stride=2),
            nn.ReLU(True),
            nn.Conv2d(8, 10, kernel_size=5),
            nn.MaxPool2d(2, stride=2),
            nn.ReLU(True)
        )

        # Regressor for the 3 * 2 affine matrix
        self.fc_loc = nn.Sequential(
            # nn.Linear(10 * 3 * 3, 32), # à changer 10*3*3
            nn.Linear(10 * 60 * 60, 32), # custom
            nn.ReLU(True),
            nn.Linear(32, 3 * 2)
        )

        # Initialize the weights/bias with identity transformation
        self.fc_loc[2].weight.data.zero_()
        self.fc_loc[2].bias.data.copy_(torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    # Spatial transformer network forward function
    def stn(self, x):
        xs = self.localization(x)
        print("End of localization (in STN):", xs.shape)
        # xs = xs.view(-1, 10 * 3 * 3) # à changer pour une autre taille image 
        # (2, 90)
        xs = xs.view(-1, 10 * 60 * 60) # custom 
        theta = self.fc_loc(xs)
        print('During STN xs :', xs.shape)
        theta = theta.view(-1, 2, 3) # shape : (2, 2, 3)
        print("During STN theta :", theta.shape,'\n')

        grid = F.affine_grid(theta, x.size())
        x = F.grid_sample(x, grid)

        return x

    def forward(self, x):
        # transform the input
        x = self.stn(x) # shape (2, 10, 3, 3)
        print('End STN :', x.shape)

        # Perform the usual forward pass
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        print('End conv1 :', x.shape)
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        print("End conv2_drop : ", x.shape)
        # x = x.view(-1, 320)
        # x = x.view(-1, np.prod(x.shape[1:])) # custom 
        x = x.view(x.size(0), -1)
        print(x.shape)
        x = F.relu(self.fc1(x))
        print('Fully connected 1 :', x.shape)
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        print('Fully connected 2 :', x.shape)
        return F.log_softmax(x, dim=1)

When I print the summary with torchsummary, I have good dimensions, but I have the same error as everyone else. Could anyone help me debug ? I can’t find where it’s wrong.

Shariq_Ali · November 29, 2019, 11:23am

Hello i’m new on machine learning i want to train unet model on my data set but i’m getting the above mentioned error. could you please help me to fix it.
My loss Function
def cross_entropy2d(input, target, weight=None, size_average=True):

target = target.squeeze(1)

n, c, h, w = input.size()
nt, ct, ht, wt = target.size()
# Handle inconsistent size between input and target
if h != ht and w != wt:  # upsample labels
    input = F.interpolate(input, size=(ht, wt), mode="bilinear", align_corners=True)

input = input.transpose(1, 2).transpose(2, 3).contiguous().view(-1, c)
target = target.view(-1)
loss = F.cross_entropy(
    input, target, weight=weight, size_average=size_average, ignore_index=250
)
return loss

My Unet Model

import torch.nn as nn

from ptsemseg.models.utils import unetConv2, unetUp

class unet(nn.Module):
def init(
self, feature_scale=4, n_classes=3, is_deconv=True, in_channels=3, is_batchnorm=True
):
super(unet, self).init()
self.is_deconv = is_deconv
self.in_channels = in_channels
self.is_batchnorm = is_batchnorm
self.feature_scale = feature_scale

    filters = [64, 128, 256, 512, 1024]
    filters = [int(x / self.feature_scale) for x in filters]

    # downsampling
    self.conv1 = unetConv2(self.in_channels, filters[0], self.is_batchnorm)
    self.maxpool1 = nn.MaxPool2d(kernel_size=2)

    self.conv2 = unetConv2(filters[0], filters[1], self.is_batchnorm)
    self.maxpool2 = nn.MaxPool2d(kernel_size=2)

    self.conv3 = unetConv2(filters[1], filters[2], self.is_batchnorm)
    self.maxpool3 = nn.MaxPool2d(kernel_size=2)

    self.conv4 = unetConv2(filters[2], filters[3], self.is_batchnorm)
    self.maxpool4 = nn.MaxPool2d(kernel_size=2)

    self.center = unetConv2(filters[3], filters[4], self.is_batchnorm)

    # upsampling
    self.up_concat4 = unetUp(filters[4], filters[3], self.is_deconv)
    self.up_concat3 = unetUp(filters[3], filters[2], self.is_deconv)
    self.up_concat2 = unetUp(filters[2], filters[1], self.is_deconv)
    self.up_concat1 = unetUp(filters[1], filters[0], self.is_deconv)

    # final conv (without any concat)
    self.final = nn.Conv2d(filters[0], n_classes, 1)

def forward(self, inputs):
    conv1 = self.conv1(inputs)
    maxpool1 = self.maxpool1(conv1)

    conv2 = self.conv2(maxpool1)
    maxpool2 = self.maxpool2(conv2)

    conv3 = self.conv3(maxpool2)
    maxpool3 = self.maxpool3(conv3)

    conv4 = self.conv4(maxpool3)
    maxpool4 = self.maxpool4(conv4)

    center = self.center(maxpool4)
    up4 = self.up_concat4(conv4, center)
    up3 = self.up_concat3(conv3, up4)
    up2 = self.up_concat2(conv2, up3)
    up1 = self.up_concat1(conv1, up2)

    final = self.final(up1)

    return final

when i print the input.size in loss function i’will get following result

please help me

ptrblck · November 30, 2019, 7:55am

Based on your code, it looks like you are working on a pixel-wise classification (e.g. segmentation use case).
Also, it looks like you would like to push all pixels into the batch dimension and just flatten the target.
This should generally work with nn.CrossEntropyLoss, although the reshaping is not necessary, as you can work with spatial outputs and targets.

However, the target shape looks wrong and it seems you might be using a one-hot encoded target.
For a segmentation use case with nn.CrossEntropyLoss, the model output should have the shape [batch_size, nb_classes, height, width], while the target should be a LongTensor with the shape [batch_size, height, width] containing the class indices in the range [0, nb_classes-1].
If you have the one-hot encoding in dim3 in your target tensor, you could simply call:

target = torch.argmax(target, 3)

to get the class indices.

On the other hand, if you are dealing with a multi-label classification, i.e. each pixel can belong to more than a single class, you should permute the target to have the same shape as your model output, and use nn.BCEWithLogitsLoss instead.

william_hero · January 11, 2020, 12:49pm

Thanks. It really helps.

saaaaa · January 27, 2020, 2:06pm

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
import torch
import os
import cv2
import matplotlib.pyplot as plt
execution_path=os.getcwd()
filenames=[“no”,“yes”]
training_data=[]
data_set=[]
data_label=[]
execution_path=os.getcwd()
dirname=’/kaggle/input/brain-mri-images-for-brain-tumor-detection/brain_tumor_dataset/’

for filename in filenames:
#print(os.path.join(dirname, filename))
path=os.path.join(dirname,filename)
filename_index=filenames.index(filename)
for image in os.listdir(path):
try:
img_array = cv2.imread(os.path.join(path, image), cv2.IMREAD_GRAYSCALE)
img_array = img_array.astype(np.float32)
img_array = cv2.resize(img_array, (64, 64))
#img = Image.open(self.paths[index]).convert(‘RGB’)
training_data.append([img_array, filename_index])
except Exception as e:
print(e)
np.random.shuffle(training_data)
for feature, label in training_data:
data_set.append(feature)
data_label.append(label)
x_train,y_train,x_test,y_test=train_test_split(data_set, data_label, test_size = 0.1,
random_state = 45)
x_train = np.array(x_train).reshape(-1, 1, 64, 64)
y_train=np.array(y_train).reshape(-1,1,64,64)
x_trainn=torch.from_numpy(x_train)
y_trainn=torch.from_numpy(y_train)

import torch.nn as nn
import torch

class CNN(nn.Module):
def init(self):
super(CNN,self).init()
self.layer1=nn.Sequential(nn.Conv2d(1,16,kernel_size=3,stride=1,padding=1),nn.ReLU(),nn.MaxPool2d(kernel_size=2,stride=2),nn.Dropout(0.5))
self.layer2=nn.Sequential(nn.Conv2d(16,32,kernel_size=3,stride=1,padding=1),nn.ReLU(),nn.MaxPool2d(kernel_size=2,stride=2),nn.Dropout(0.5))
self.layer3=nn.Sequential(nn.Conv2d(32,64,kernel_size=3,stride=1,padding=1),nn.ReLU(),nn.MaxPool2d(kernel_size=2,stride=2,padding=1),nn.Dropout(0.5))
self.fc1=nn.Linear(5184,2,bias=True)

    #self.layer4=nn.Sequential(self.fc1,nn.ReLU(),nn.Dropout(0.5))
    #self.fc2=nn.Linear(630,5,bias=True)
def forward(self,X):
    out=self.layer1(X)
    out=self.layer2(out)
    out=self.layer3(out)
    out=out.view(out.size(0),-1)
    out=self.fc1(out)
    #out=self.fc2(out)
    return out

model=CNN()
print(model)

criterion=nn.CrossEntropyLoss()
optimizer=torch.optim.Adam(model.parameters(),lr=0.01)

y_pred=model(x_trainn)
loss=criterion(y_pred,y_trainn)
optimizer.zero_grad()
loss.backward()
optimizer.step()

P.S. Expected input batch_size (227) to match target batch_size (26).

saaaaa · January 27, 2020, 2:07pm

Please tell me how to remove the error?

saaaaa · January 27, 2020, 5:33pm

@ptrblck sir plzz reply

CesMak · February 19, 2020, 7:20pm

Hey there,
I have a similar issue and cannot find the solution:

# tested with python 3.7.5
from torch.utils.data import Dataset
import ast
import torch
import torch.nn as nn
from torch.autograd import Variable

# Links:
# https://www.kaggle.com/danieldagnino/training-a-classifier-with-pytorch

class my_model(nn.Module):
    def __init__(self,n_in=60, n_hidden=10, n_out=60):
        super(my_model,self).__init__()
        self.n_in  = n_in
        self.n_out = n_out

        self.linearlinear = nn.Sequential(
            nn.Linear(self.n_in,  self.n_out, bias=True),   # Hidden layer.
            )
        self.logprob = nn.LogSoftmax(dim=1)                 # -Log(Softmax probability).

    def forward(self,x):
        x = self.linearlinear(x)
        print("In forward:", x.shape)
        x = self.logprob(x)
        print("In forward:", x.shape)
        return x

class TESNamesDataset(Dataset):
    def __init__(self, data_root):
        self.samples = []
        with open(data_root,"r") as f:
            self.samples = [ [ast.literal_eval(ast.literal_eval(elem)[0]), ast.literal_eval(ast.literal_eval(elem)[1])] for elem in f.read().split('\n') if elem]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        # Function that returns one input and one output (label)
        return torch.Tensor(self.samples[idx][0]), torch.Tensor(self.samples[idx][1])

if __name__ == '__main__':
    from torch.utils.data import DataLoader
    my_data = TESNamesDataset('first_move.txt')
    my_loader = DataLoader(my_data, batch_size=1, num_workers=0)

    model = my_model()
    criterium = nn.NLLLoss()
    optimizer = torch.optim.Adam(model.parameters(),lr=0.1,weight_decay=1e-4)

    # Taining.
    for k, (data, target) in enumerate(my_loader):
        print("\nEpoch:", k)
        print(data)
        print(target)
        #print(data.view(-1).shape) # 1x60
        data   = Variable(data,         requires_grad=False) # input
        target = Variable(target.long(), requires_grad=False) # output

        # Set gradient to 0.
        optimizer.zero_grad()
        # Feed forward.
        pred = model(data)

        print("Pred.shape:", pred.shape)
        print("target.shape:", target.shape)

        print(target.view(-1).shape)
        #print(eee)

        loss = criterium(pred, target.view(-1))

        # Gradient calculation.
        loss.backward()

        # Print loss every 10 iterations.
        if k%10==0:
            print('Loss {:.4f} at iter {:d}'.format(loss.item(),k))

        # Model weight modification based on the optimizer.
        optimizer.step()

Output:

Epoch: 0
tensor([[0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0.,
         0., 1., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0., 0., 0., 0., 0., 1., 0.,
         0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 1., 1., 1., 0., 0., 0., 1., 1.,
         1., 0., 1., 0., 1., 0.]])
tensor([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0.]])
In forward: torch.Size([1, 60])
In forward: torch.Size([1, 60])
Pred.shape: torch.Size([1, 60])
target.shape: torch.Size([1, 60])
torch.Size([60])

Any Idea why I get this error:
ValueError: Expected input batch_size (1) to match target batch_size (60).

Solution:
I found a solution, my labels were binary 60x1 which is also how I wanted them to be! But with this I always got the above error. For this reason I made the binary to an int and then it worked. I do not know however if I still need the 60 outputs now, actually I just need one number (showing the binary index).

rai24 · March 2, 2020, 10:33pm

@ptrblck

Define Autoencoder

#dropout_p= 0.5
class Autoencoder(nn.Module):

def __init__(self):
    super(Autoencoder, self).__init__()
    self.encoder = nn.Sequential(
       # nn.Dropout2d(dropout_p= 0.5),
        nn.Conv2d(1, 16,kernel_size=3,stride =1, padding= 1,bias= False),  
        nn.ReLU(True),
        nn.MaxPool2d( 2, 2),  
        nn.Conv2d(16,32,kernel_size=3,stride =1, padding=1),  
        nn.ReLU(True),
        nn.MaxPool2d(2, 2),  
        nn.Conv2d(32,64,kernel_size=3,stride =1, padding= 1),  
        nn.ReLU(True),
        nn.MaxPool2d(2, 2),  
        
    )
    self.decoder = nn.Sequential(
        nn.MaxUnpool2d(2, 2),
        nn.ConvTranspose2d(64, 32, kernel_size=3), 
        nn.ReLU(True),
        nn.MaxUnpool2d(2, 2),
        nn.ConvTranspose2d(32, 16, kernel_size=3),  
        nn.ReLU(True),
        nn.MaxUnpool2d(2, 2),
        nn.ConvTranspose2d(16, 8, kernel_size=3)
        #nn.Tanh()
    )

def forward(self, x):
    x = self.encoder(x)
    encoded_x = x
    x = self.decoder(x)
    print(x.shape)
    return x, encoded_x

num_epochs = 5
learning_rate = 1e-5
model= Autoencoder()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
for epoch in range(num_epochs):
for data in loader_train_set:
img, label = data
img = img.view(-1,1,320,240)
img = Variable(img)
label = Variable(label)
encoded_output = model.encoder(img)
loss = criterion(encoded_output, label)
optimizer.zero_grad()
loss.backward()
optimizer.step()

print(‘epoch [{}/{}], loss:{:.4f}’.format(epoch+1, num_epochs, loss.data()))

ValueError: Expected input batch_size (60) to match target batch_size (20).
I tried solving using above provided solutions, but didn’t worked for my code. Need help.

ptrblck · March 2, 2020, 11:29pm

Could you post the shapes of img, encoded_output, and label?

Also, Variables are deprecated since PyTorch 0.4.0, so you can just use tensors now.

@CesMak
nn.NLLLoss expects a target as a LongTensor containing the class indices in the range [0, nb_classes-1], while your target seems to be one-hot encoded.
You could try to create the expected target via target = torch.argmax(target, 1).

rai24 · March 3, 2020, 2:27pm

Thank you so much!
shapes of img , encoded_output , and label are as below!
Label = torch.Size([20])
img = torch.Size([60, 1, 320, 240])
encoded_output= torch.Size([60, 64, 40, 30])

ptrblck · March 3, 2020, 4:24pm

Your encoded_output shape looks as if it contains logits for a multi-class segmentation, while the targets are for a multi-class classification with another batch size.

By best guess at the moment is that this call img = img.view(-1,1,320,240) creates a new batch size.
What is the shape of img before the view op?

rai24 · March 3, 2020, 4:38pm

Its is torch.Size([20, 3, 240, 320])

rai24 · March 3, 2020, 4:42pm

Yes, you are right. View operation is making the new batch size, but I used it because of the channel mismatch issue, I am using the grey scale image and before the view operation the size of the img is [20,3,240,340], however it should be [20,1,240,340].

ptrblck · March 3, 2020, 4:43pm

You can’t use view in this case.
If you are sure you are dealing with a grayscale image, you could either load the image in the grayscale format from the beginning or slice the tensor via img = img[:, 0], since all channels should contain the same value.

rai24 · March 3, 2020, 4:48pm

img, label = data
print(img.size())
img = torch.tensor(img)
img = img[:, 0]
encoded_output = model.encoder(img)
is giving me this error
RuntimeError: Expected 4-dimensional input for 4-dimensional weight 16 1 3 3, but got 3-dimensional input of size [20, 240, 320] instead

ptrblck · March 3, 2020, 4:49pm

Sorry, my bad. Unsqueeze img at dim1 or use img = img[:, 0:1]

rai24 · March 3, 2020, 4:53pm

Apologies, for throwing out so many questions, but I am trying quite a long time and I am just stuck.
Got below error now.
RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of dimension: 1