RuntimeError: stack expects each tensor to be equal size, but got [1, 691, 1228] at entry 0 and [1, 90, 90] at entry 1

Hello guys.
I’m trying to run this example on my own data.

My data: Dataset = [1854,1,90,90]

transform = transforms.Compose([transforms.Grayscale(num_output_channels=1),transforms.ToTensor(),transforms.Normalize([0.5], [0.5])])
dataset2 = datasets.ImageFolder(path_data_training_images, transform=transform)

dataloader = torch.utils.data.DataLoader(dataset2, batch_size=5, shuffle=True, num_workers=0)
print("RBC", np.shape(dataloader), type(dataloader), len(dataloader))

dataiter2 = iter(dataloader)
images2, labels2 = next(dataiter2)


optimizer_G = torch.optim.RMSprop(generator.parameters(),
optimizer_D = torch.optim.RMSprop(discriminator.parameters(),

Tensor = torch.cuda.FloatTensor if cuda else torch.FloatTensor




batches_done = 0
for epoch in range(opt.n_epochs):

    for i, (imgs, _) in enumerate(dataloader):

        # Configure input
        real_imgs = Variable(imgs.type(Tensor))
        #print("Type of every element:", real_imgs.dtype,"Number of axes:", real_imgs.ndim,"Shape of tensor:", real_imgs.shape)
        print("Total number of elements (64*1*28*28): ", real_imgs.numel())

        # ---------------------
        #  Train Discriminator
        # ---------------------

        optimizer_D.zero_grad()

        # Sample noise as generator input
        z = Variable(Tensor(np.random.normal(0, 1, (imgs.shape[0], opt.latent_dim))))

        # Generate a batch of images
        fake_imgs = generator(z).detach()
        # Adversarial loss
        #print("img_real",np.shape(real_imgs), "img_fake", np.shape(fake_imgs))
        loss_D = -torch.mean(discriminator(real_imgs)) + torch.mean(discriminator(fake_imgs))

        loss_D.backward()
        optimizer_D.step()

        # Clip weights of discriminator
        for p in discriminator.parameters():
            p.data.clamp_(-opt.clip_value, opt.clip_value)

        # Train the generator every n_critic iterations
        if i % opt.n_critic == 0:

            # -----------------
            #  Train Generator
            # -----------------

            optimizer_G.zero_grad()

            # Generate a batch of images
            gen_imgs = generator(z)
            # Adversarial loss
            loss_G = -torch.mean(discriminator(gen_imgs))

            loss_G.backward()
            optimizer_G.step()

            print(
                "[Epoch %d/%d] [Batch %d/%d] [D loss: %f] [G loss: %f]"
                % (epoch, opt.n_epochs, batches_done % len(dataloader), len(dataloader), loss_D.item(), loss_G.item())
            )

        if batches_done % opt.sample_interval == 0:
            save_image(gen_imgs.data[:25], "images/%d.png" % batches_done, nrow=5, normalize=True)
        batches_done += 1

But I get this error:
RuntimeError: stack expects each tensor to be equal size, but got [1, 691, 1228] at entry 0 and [1, 90, 90] at entry 1

And when I use collate_fn, I have the following problem:


def collate_fn(data):
    img, bbox = data
    zipped = zip(img, bbox)
    return list(zipped)

ValueError: too many values to unpack (expected 2)
Any help, please?

Based on the error message, the default collate_fn fails to create a batch from the images returned by Dataset.__getitem__, since they have different spatial resolutions (691x1228 for the first sample, 90x90 for the second).
The usual fix is to resize all images to a fixed size, which avoids this error.
To do so, you could add torchvision.transforms.Resize to your transform.

Your second approach of using a custom collate_fn would also work, but note that you would have to pass the images one-by-one to the model if you are not padding or resizing them in the training loop.
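
The ValueError in the question comes from the fact that collate_fn receives a list with one (image, label) tuple per sample, so with batch_size=5 unpacking it into two names fails. Here is a rough sketch of two possible collate functions, one returning a list of images and one padding them. The VariableSizeImages dataset and the function names are hypothetical and only used for illustration; zero padding on the right and bottom is also just one possible choice:

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, Dataset


class VariableSizeImages(Dataset):
    """Hypothetical stand-in for a dataset whose images differ in size."""

    def __init__(self):
        self.sizes = [(691, 1228), (90, 90), (120, 80), (90, 90)]  # (height, width)

    def __len__(self):
        return len(self.sizes)

    def __getitem__(self, idx):
        h, w = self.sizes[idx]
        return torch.zeros(1, h, w), idx % 2  # (image, label)


def collate_as_list(batch):
    # `batch` is a list with one (image, label) tuple per sample,
    # so transpose it instead of unpacking it into two names.
    images, labels = zip(*batch)
    return list(images), torch.tensor(labels)


def collate_with_padding(batch):
    # Zero-pad each image on the right and bottom up to the largest
    # height and width found in this batch, then stack.
    images, labels = zip(*batch)
    max_h = max(img.shape[1] for img in images)
    max_w = max(img.shape[2] for img in images)
    padded = [
        F.pad(img, (0, max_w - img.shape[2], 0, max_h - img.shape[1]))
        for img in images
    ]
    return torch.stack(padded), torch.tensor(labels)


dataset = VariableSizeImages()

# Option 1: keep the images in a list and feed them one by one.
shapes = []
for images, labels in DataLoader(dataset, batch_size=2, collate_fn=collate_as_list):
    for img in images:
        single = img.unsqueeze(0)  # add a batch dimension: [1, 1, H, W]
        shapes.append(tuple(single.shape))  # a model call would go here

# Option 2: pad inside collate_fn so each batch is one stacked tensor.
padded_loader = DataLoader(dataset, batch_size=2, collate_fn=collate_with_padding)
padded_batch, padded_labels = next(iter(padded_loader))
print(padded_batch.shape)  # torch.Size([2, 1, 691, 1228])
```

Note that the list variant only helps if the model itself accepts arbitrary input sizes; a generator and discriminator built for a fixed resolution would still need resized or padded inputs.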

PS: you can post code snippets by wrapping them into three backticks ```, which makes debugging a bit easier. :wink:

Thank you! I included


and now it’s working!