ValueError: Expected input batch_size (324) to match target batch_size (4)

Thanks for the update.
In your code you are using enumerate, which will return an index as the first return value, so you should change the DataLoader loop to:

for i, data in enumerate(trainloader):
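
A minimal sketch of what enumerate yields, using a plain list of (inputs, labels) pairs as a hypothetical stand-in for trainloader (the values are made up for illustration):

```python
# Hypothetical stand-in for a DataLoader: each element is one (inputs, labels) batch.
trainloader = [([0.1, 0.2], [0, 1]), ([0.3, 0.4], [1, 0])]

seen = []
for i, data in enumerate(trainloader):
    inputs, labels = data  # unpack the batch, not the index
    seen.append((i, inputs, labels))

print(seen[0])  # (0, [0.1, 0.2], [0, 1])
```

The first value is the running index, so unpacking the loop variable directly into inputs and labels would assign the index to inputs.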

Once this is fixed, you’ll run into a shape mismatch error and you should change:

# self.fc1 = nn.Linear(16*8*8, 128)
self.fc1 = nn.Linear(256, 128)
# x = x.view(-1,16*8*8) 
x = x.view(x.size(0), -1)

Afterwards, you’ll run into issues since you are using the deprecated .data attribute, so change it to .item().
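
A small sketch of the .item() pattern, assuming the value in question is a scalar loss tensor (the tensor below is a made-up stand-in, not the code from the original post):

```python
import torch

loss = torch.tensor(0.25)    # stand-in for a scalar loss tensor

running_loss = 0.0
running_loss += loss.item()  # .item() returns a plain Python float

print(type(loss.item()).__name__, running_loss)  # float 0.25
```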

it worked! Thank you so much. I was stuck for 3 days :blush:

I have this error:
ValueError: Expected input batch_size (1) to match target batch_size (20).

import torch.nn as nn
import torch.nn.functional as F

# define the CNN architecture
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.conv3 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(184320, 512)
        self.fc2 = nn.Linear(512, 133)
        self.dropout = nn.Dropout(0.5)

    def forward(self, x):
        ## Define forward behavior
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))
        x = x.view(-1, 184320)
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.fc2(x)
        return x

model = Net()

if train_on_gpu:
    model.cuda()

Change this line of code:

x = x.view(-1, 184320)

to

x = x.view(x.size(0), -1)

and rerun the code.
I guess your calculated feature size might be wrong and thus you are changing the batch size of x.
If you get a shape mismatch error in the linear layer, you would need to adapt the in_features of this layer to the expected value.
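
A minimal sketch of why the hard-coded view can change the batch size. The 12x12 spatial size is an assumption, chosen so that 20 * 64 * 12 * 12 = 184320, which would be consistent with the reported batch sizes (1 vs. 20); the real activation shape in the post is not known.

```python
import torch

# Assumed activation after the third pooling layer: 20 samples, 64 channels, 12x12.
x = torch.randn(20, 64, 12, 12)

# Hard-coded feature size: all 20 samples are folded into a single row.
wrong = x.view(-1, 184320)
print(wrong.shape)  # torch.Size([1, 184320])

# Keeping dim0 fixed preserves the batch size; -1 infers the features per sample.
right = x.view(x.size(0), -1)
print(right.shape)  # torch.Size([20, 9216])
```

Under this assumption the in_features of fc1 would have to be 9216 rather than 184320.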


Hey, I have a similar problem with my code. I’ve read all the threads, still, I couldn’t figure it out.

My data has 16261 rows * 243 columns of input (numerical data) and a 1d output.

I changed my input shape from torch.Size([1, 16261, 243]) to torch.Size([1, 243, 16261]) because of the following lines in some post.

When you enter the one-dimensional convolution, you need to convert 32*35*256 to 32*256*35, because the one-dimensional convolution is swept in the last dimension.

So, my Input and Output shapes are, as follows:

Input tensor shape:  torch.Size([1, 243, 16261])
output tensor shape:  torch.Size([16261])
number of output classes = 14

this is the model_shapes (Conv1d)

Input shape:  torch.Size([1, 243, 16261])
Conv1d 1st layer shape:  torch.Size([1, 60, 8129])
conv1d 2nd layer shape:  torch.Size([1, 120, 4063])
output size after flattening:  torch.Size([1, 487560])
1st FC layer shape:  torch.Size([1, 800])
2nd FC layer shape:  torch.Size([1, 14])
tensor([[4.1403e+16, 0.0000e+00, 0.0000e+00, 4.6436e+16, 7.4836e+15, 0.0000e+00,
         4.8121e+16, 0.0000e+00, 0.0000e+00, 7.0205e+16, 3.3093e+16, 2.5460e+16,
         0.0000e+00, 0.0000e+00]], grad_fn=<ReluBackward0>)

My error is

Expected input batch_size (1) to match target batch_size (16261)

Thank you in Advance!!

Could you explain your use case a bit?
Based on the description:

it seems that you are dealing with 16261 samples each having a temporal dimension of 243?
Or is this representing a single batch with a temporal dimension of 16261 and 243 features?

Currently the error is raised, as you are treating the input data as a single sample, while the target is a batch containing 16261 samples.

I have 243 features that give a single output, and I have 16261 such samples. My data size is torch.Size([16261, 243]).

For the batch size, I just used .unsqueeze(dim=0) to add a third dimension, which gives the batch a size of 1, so now my data size has become torch.Size([1, 16261, 243]).

I can change my batch size to 72 using .expand()

Here are my code cells,

x_train = x_train_tensor.unsqueeze(dim=0)
x_train = x_train.permute(0,2,1)
x_train = x_train.expand(72, 243, 16261)
print('Input tensor reshaped: ', x_train.shape)

y_train = y_train_tensor
print('output tensor shape: ',y_train.shape)
Input tensor reshaped:  torch.Size([72, 243, 16261])
output tensor shape:  torch.Size([16261])

Conv1d model

# Model

class CNN(nn.Module):
  def __init__(self):
    super(CNN, self).__init__()
    # 243 is the input to the 1st layer because I have 243 features that give one output. P.S. I'm not sure whether I'm right here.
    self.layer1 = nn.Sequential(nn.Conv1d(243, 60, kernel_size=5, stride=1, padding=1), nn.ReLU(), nn.MaxPool1d(kernel_size=2, stride=2))
    self.layer2 = nn.Sequential(nn.Conv1d(60, 120,  kernel_size=5, stride=1, padding=1), nn.ReLU(), nn.MaxPool1d(kernel_size=2, stride=2))
    self.drop_out = nn.Dropout(0.5)
    self.fc1 = nn.Linear(120*4063, 800)
    self.relu_act = nn.ReLU()
    self.fc2 = nn.Linear(800, 14)
  def forward(self, x, prints = False):
    if prints: print('Input shape: ', x.shape)

    out = self.layer1(x)
    if prints: print('Conv1d 1st layer shape: ', out.shape)
    out = self.layer2(out)
    if prints: print('conv1d 2nd layer shape: ', out.shape)
    out = self.drop_out(out)
    out = out.view(out.size(0), -1)
    if prints: print(' out size after flattening: ', out.shape)

    out = F.relu(self.fc1(out))
    if prints: print('1st FC layer shape: ', out.shape)

    out = F.relu(self.fc2(out))
    if prints: print('2nd FC layer shape: ', out.shape)
    out = F.log_softmax(out, dim = 1)
    return out
Input shape:  torch.Size([72, 243, 16261])
Conv1d 1st layer shape:  torch.Size([72, 60, 8129])
conv1d 2nd layer shape:  torch.Size([72, 120, 4063])
output size after flattening:  torch.Size([72, 487560])
1st FC layer shape:  torch.Size([72, 800])
2nd FC layer shape:  torch.Size([72, 14])
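
The printed shapes are consistent with the standard output-length formula for nn.Conv1d and nn.MaxPool1d. A small check in plain Python (conv1d_len is a hypothetical helper written for this sketch, and dilation 1 is assumed):

```python
def conv1d_len(length, kernel_size, stride=1, padding=0):
    # Output length of nn.Conv1d / nn.MaxPool1d with dilation=1 (floor division).
    return (length + 2 * padding - kernel_size) // stride + 1

length = 16261
for _ in range(2):  # layer1 and layer2 share the same conv/pool settings
    length = conv1d_len(length, kernel_size=5, stride=1, padding=1)  # Conv1d
    length = conv1d_len(length, kernel_size=2, stride=2)             # MaxPool1d
    print(length)  # 8129, then 4063

flat_features = 120 * length
print(flat_features)  # 487560
```

The flattened size depends only on the sequence length, not on the batch size, so the in_features of fc1 is not what causes the batch size mismatch in the loss.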

Checking whether it’s working…

loss_function = nn.CrossEntropyLoss()
loss = loss_function(prediction, y_train)

my Error

ValueError: Expected input batch_size (72) to match target batch_size (16261)

yes, I just changed the batch size to 72. the error is the same.

could you please help me to sort this out?

This sounds as if you are dealing with 16261 samples, each having 243 features. The shape looks right for e.g. a linear layer, but is missing a dimension for nn.Conv1d.

I don’t understand this step. Is 16261 supposed to be the temporal dimension now?

If my data shape is torch.Size([16261, 243]), I get an error, something like: Expected input dim=3, got dim=2.
So I add the third dimension using .unsqueeze().

Are you saying it is not necessary to add the third dimension, but to change something for nn.Conv1d?

If so, what should I change?

You said something about a missing dimension. What is that?
So far, the examples I came across were like the model I created.

Thank you for your response!

nn.Conv1d expects an input in the shape [batch_size, channels, seq_len] so which of the two dimensions would refer to the expected ones?
Since you are using 2 dimensions, one is missing. Based on your description I guess 16261 would be the number of samples and thus the batch_size, while 243 could either be the channel dimension or the temporal dimension. In the former case, you wouldn’t need to use nn.Conv1d and could just use a linear layer. In the latter case, you would need to unsqueeze(1) to create a channel dimension with the size of 1.
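
The two options mentioned in this reply, sketched on a small random batch. The batch of 8 samples stands in for the 16261 rows, and the layer sizes are assumptions for illustration rather than a recommended architecture:

```python
import torch
import torch.nn as nn

x = torch.randn(8, 243)          # 8 samples with 243 values each
y = torch.randint(0, 14, (8,))   # one class index per sample, 14 classes
criterion = nn.CrossEntropyLoss()

# Option 1: a linear layer consumes [batch_size, features] directly.
linear = nn.Linear(243, 14)
out_linear = linear(x)
print(out_linear.shape)          # torch.Size([8, 14])

# Option 2: add a channel dimension of size 1 so that 243 becomes seq_len.
x_seq = x.unsqueeze(1)           # [8, 1, 243]
conv = nn.Conv1d(1, 60, kernel_size=5, padding=1)
features = conv(x_seq)           # [8, 60, 241]
head = nn.Linear(60 * 241, 14)
out_conv = head(features.view(features.size(0), -1))
print(out_conv.shape)            # torch.Size([8, 14])

# Both outputs keep batch size 8, so they match the target of shape [8].
loss = criterion(out_linear, y) + criterion(out_conv, y)
```

In both cases dim0 stays the number of samples, which is what nn.CrossEntropyLoss compares against the target.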

ValueError: Expected input batch_size (7) to match target batch_size (64).
Hi @ptrblck, could you please tell me why this error is raised?
class Network(nn.Module):

    def __init__(self, C, num_classes, layers, criterion, steps=4, multiplier=4, stem_multiplier=3):
        super(Network, self).__init__()
        self._C = C
        self._num_classes = num_classes
        self._layers = layers
        self._criterion = criterion
        self._steps = steps
        self._multiplier = multiplier

        C_curr = stem_multiplier*C
        self.stem = nn.Sequential(
            nn.Conv2d(3, C_curr, 3, padding=1, bias=False),
        )

        C_prev_prev, C_prev, C_curr = C_curr, C_curr, C
        self.cells = nn.ModuleList()
        reduction_prev = False
        for i in range(layers):
            if i in [layers//3, 2*layers//3]:
                C_curr *= 2
                reduction = True
            else:
                reduction = False
            cell = Cell(steps, multiplier, C_prev_prev, C_prev, C_curr, reduction, reduction_prev)
            reduction_prev = reduction
            self.cells += [cell]
            C_prev_prev, C_prev = C_prev, multiplier*C_curr

        self.global_pooling = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(C_prev, num_classes)

    def new(self):
        model_new = Network(self._C, self._num_classes, self._layers, self._criterion).cuda()
        for x, y in zip(model_new.arch_parameters(), self.arch_parameters()):
            ...
        return model_new

    def forward(self, input):
        s0 = s1 = self.stem(input)
        for i, cell in enumerate(self.cells):
            if cell.reduction:
                weights = F.softmax(self.alphas_reduce, dim=-1)
                n = 3
                start = 2
                weights2 = F.softmax(self.betas_reduce[0:2], dim=-1)
                for i in range(self._steps-1):
                    end = start + n
                    tw2 = F.softmax(self.betas_reduce[start:end], dim=-1)
                    start = end
                    n += 1
                    weights2 = torch.cat([weights2,tw2],dim=0)
            else:
                weights = F.softmax(self.alphas_normal, dim=-1)
                n = 3
                start = 2
                weights2 = F.softmax(self.betas_normal[0:2], dim=-1)
                for i in range(self._steps-1):
                    end = start + n
                    tw2 = F.softmax(self.betas_normal[start:end], dim=-1)
                    start = end
                    n += 1
                    weights2 = torch.cat([weights2,tw2],dim=0)
            s0, s1 = s1, cell(s0, s1, weights,weights2)
        out = self.global_pooling(s1)

        logits = self.classifier(out)
        return out,logits


I don’t fully understand how your code is working, but

weights2 = torch.cat([weights2,tw2],dim=0)

looks a bit weird, as it would change dim0, which is usually the batch dimension.
Print the shape of the input as well as the intermediate tensors and make sure they are keeping their batch size.
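
One way to print the intermediate shapes without editing forward is a forward hook. A minimal sketch on a toy model (the layers below are stand-ins, not the Network from the post; log_shape is a hypothetical helper):

```python
import torch
import torch.nn as nn

# Toy stand-in for the real network.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 2),
)

shapes = []

def log_shape(module, inputs, output):
    # Record the output shape of every hooked module.
    shapes.append((module.__class__.__name__, tuple(output.shape)))

handles = [m.register_forward_hook(log_shape) for m in model]

x = torch.randn(64, 3, 32, 32)
model(x)
for name, shape in shapes:
    print(name, shape)

# Every intermediate tensor should still have 64 in dim0.
assert all(shape[0] == x.size(0) for _, shape in shapes)

for h in handles:
    h.remove()  # detach the hooks once the check is done
```

A layer whose recorded dim0 differs from the input batch size is the first place to look for a wrong view or cat.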

Hi @ptrblck thanks for your reply,
torch.Size([64, 256])
torch.Size([64, 2])
I tried printing the dimensions of out and logits respectively as above

and input shape is:torch.Size([64, 3, 32, 32])

pls help anyone :cold_sweat: with this

The intermediate shapes look alright so far, as their batch size is equal to the batch size of the input.
However, do you know where the input batch_size (324) and target batch_size (4) come from, as neither matches the printed 64?

Hi @ptrblck!

I am very new to deep learning and ran into the same error. To provide a bit of context, I am using a pretrained model that classifies whether images are in distribution or out of distribution. My goal is to apply this model to a new dataset that should be out of distribution and analyse the results. However, the new dataset is not a simple zip of PNG files like the other datasets (like ImageNet) the model was trained and tested on, but an HDF5 gzip. Having never worked with such files before, I still managed to load the dataset thanks to h5py and then loaded the resulting array into PyTorch using a TensorFlow data loader. However, I am now encountering a wide range of errors related to mismatches in dimensions, the latest of which is “ValueError: Expected input batch_size (9) to match target batch_size (1)”.

Although I first came here to try and solve this error, I am now wondering if the way I loaded the data might simply not be compatible with the model I am trying to use. Would you maybe have any advice regarding loading HDF5 files?

I would be very grateful for any help you could provide :slight_smile:

I don’t think the error is created by using HDF5 by itself, but is most likely caused by a wrong view operation performed on the data.
Check all view operations and make sure the batch size is kept equal e.g. via x = x.view(x.size(0), -1).
Your current error message mentions the shape mismatch of the batch dimension as 9 vs. 1, so check which batch size is expected and use it to narrow down which tensor shape is wrong.

Thanks a lot for your answer!! Using x = x.view(x.size(0), -1) actually worked, but now I am running into several other errors related, like the previous ones, to the shape of the data, inputs, etc. Although I could continue trying to fix them one by one, I am afraid this would not converge towards anything in the end. Although the pretrained model I am using is in PyTorch, the recommended data loader for the new dataset I am trying to use is in Keras. I am wondering whether I should rather use Keras to load the data and afterwards find a way to use it in PyTorch. Would that be realistic to try?

I don’t know if the Keras data loading pipeline is creating the shape mismatches. To check it you could create random input tensors in the expected shape, pass them to the model, train one step, and verify that your general model training is working fine. Once this is done you could check the data shapes created by the Keras data loading and make sure they match the expected shapes.
On the other hand, you could of course also write a custom Dataset in PyTorch directly as explained here.
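
The random-input check suggested in this reply could be sketched as follows. The toy model, the input shape [9, 3, 32, 32], and the two classes are assumptions for illustration; the real pretrained model and its expected input shape are not shown in the thread.

```python
import torch
import torch.nn as nn

# Toy stand-in for the pretrained model; swap in the real one.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 2))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Random batch in the shape the model expects (assumed: 9 RGB images of 32x32).
inputs = torch.randn(9, 3, 32, 32)
targets = torch.randint(0, 2, (9,))  # one class index per sample

# One full training step: forward, loss, backward, parameter update.
optimizer.zero_grad()
outputs = model(inputs)
loss = criterion(outputs, targets)
loss.backward()
optimizer.step()

print(outputs.shape, loss.item())
```

If this step runs cleanly, the model and loss are fine on their own, and the shapes coming out of the data loading pipeline can be compared against inputs.shape and targets.shape.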
