Concatenating images

Niki · March 26, 2019, 10:48pm

I want to write code in PyTorch that concatenates two images (32 by 32) so that the output image is 64 by 32. How should I do that?

Thank you~

ptrblck · March 26, 2019, 11:06pm

torch.cat should do the job:

a = torch.randn(3, 32, 32)
b = torch.randn(3, 32, 32)
c = torch.cat((a, b), 1)
print(c.shape)
> torch.Size([3, 64, 32])
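
To make the choice of dimension explicit, here is a minimal sketch (the names tall, wide, and tall_batch are only illustrative). For a single image of shape [C, H, W], height is dim 1 and width is dim 2. Once a batch dimension is present, [N, C, H, W], height moves to dim 2.

```python
import torch

a = torch.randn(3, 32, 32)  # one image: [channels, height, width]
b = torch.randn(3, 32, 32)

tall = torch.cat((a, b), dim=1)  # stack along height
wide = torch.cat((a, b), dim=2)  # stack along width
print(tall.shape)  # torch.Size([3, 64, 32])
print(wide.shape)  # torch.Size([3, 32, 64])

# With a leading batch dimension, height is dim 2 instead of dim 1.
tall_batch = torch.cat((a.unsqueeze(0), b.unsqueeze(0)), dim=2)
print(tall_batch.shape)  # torch.Size([1, 3, 64, 32])
```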

Niki · March 26, 2019, 11:24pm

Thank you for your reply. I saw torch.cat but don’t know how to apply it here:

import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=5,
                                          shuffle=True, num_workers=1)

classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

for epoch in range(3):
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        # Final_input = image 64 by 32

ptrblck · March 26, 2019, 11:30pm

You are currently using a batch size of 5, which won’t work if you would like to concatenate two images.
If you use an even batch size, you could concatenate the images using this code:

inputs = torch.cat((inputs[::2], inputs[1::2]), 2)

Since you are using shuffle=True, I assume the pairs used to create the larger tensors do not matter. Is this correct or would you like to concatenate specific pairs of image tensors?
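
As a rough illustration of which images end up paired by the slicing above, the sketch below uses a hypothetical stand-in batch in which every pixel of image k has the value k, so the source image of each half can be read back from the result.

```python
import torch

# Stand-in for one DataLoader batch: 4 images, 3 channels, 32x32.
# Every pixel of image k holds the value k (hypothetical data for tracing).
inputs = torch.arange(4).float().view(4, 1, 1, 1).expand(4, 3, 32, 32)

# Even-indexed images form the top half, odd-indexed images the bottom half
# (dim 2 is the height of a [N, C, H, W] batch).
paired = torch.cat((inputs[::2], inputs[1::2]), 2)
print(paired.shape)  # torch.Size([2, 3, 64, 32])

top_ids = paired[:, 0, 0, 0]      # source image of the top half
bottom_ids = paired[:, 0, 32, 0]  # source image of the bottom half
print(top_ids.tolist(), bottom_ids.tolist())  # [0.0, 2.0] [1.0, 3.0]
```

So output 0 is built from images 0 and 1, and output 1 from images 2 and 3.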

Niki · March 26, 2019, 11:58pm

Thank you very much! It’s working. No, I don’t consider specific pairs.
If I want to concatenate the labels at the same time, is this correct?

labels = torch.cat((labels, labels))

ptrblck · March 27, 2019, 12:13am

Not really, as your samples now contain more than a single class and you are now dealing with a multi-label classification use case.
In this setup, you could create multi-hot targets and change the criterion to e.g. nn.BCEWithLogitsLoss:

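# num_classes=10 (CIFAR-10) keeps the width fixed; otherwise it is inferred from the largest label in the batch
one_hot_labels = torch.nn.functional.one_hot(labels, num_classes=10)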
one_hot_labels = one_hot_labels[::2] | one_hot_labels[1::2]
one_hot_labels = one_hot_labels.float()
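
A small worked example of the multi-hot construction, assuming 10 classes and a hypothetical batch of four labels. Note that a pair with the same class in both halves yields a single 1, not a 2.

```python
import torch
import torch.nn.functional as F

nb_classes = 10                      # assumption: CIFAR-10
labels = torch.tensor([3, 5, 7, 7])  # hypothetical labels of 4 images

one_hot = F.one_hot(labels, num_classes=nb_classes)  # [4, 10], dtype long
# Bitwise OR merges the one-hot rows of each (even, odd) image pair.
multi_hot = (one_hot[::2] | one_hot[1::2]).float()   # [2, 10]

print(multi_hot[0])  # ones at classes 3 and 5
print(multi_hot[1])  # a single one at class 7
```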

Niki · March 27, 2019, 12:21am

I tried that, but it gives me this error:

AttributeError: module 'torch.nn.functional' has no attribute 'one_hot'

The torch version is 1.0.1.post2.

ptrblck · March 27, 2019, 12:25am

I’m using 1.0.0.dev20190312. You could use the nightly build, or if you prefer to stay on 1.0.1.post2, you could use this code to create the one-hot encoded targets:

one_hot_labels = torch.zeros(labels.size(0), nb_classes, dtype=torch.long).scatter_(1, labels.unsqueeze(1), 1)
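
To show what the scatter_ call does, here is a tiny sketch with two hypothetical labels and nb_classes assumed to be 10. The index tensor must have the same number of dimensions as the target, hence the unsqueeze(1).

```python
import torch

nb_classes = 10                # assumption: CIFAR-10
labels = torch.tensor([1, 4])  # hypothetical labels

index = labels.unsqueeze(1)    # shape [2, 1]: one column index per row
# For each row i, write the value 1 at column index[i, 0].
one_hot = torch.zeros(labels.size(0), nb_classes, dtype=torch.long).scatter_(1, index, 1)
print(one_hot)
```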

Niki · March 27, 2019, 12:40am

Thank you! That works, but I noticed that one_hot_labels has size [2, 10] at the end of the loop. Is it possible for it to have size [2, 10] inside the loop as well?
If I want to concatenate more than two images, which parts of inputs and one_hot_labels should be changed?

ptrblck · March 27, 2019, 12:58am

A more general approach could use torch.split and a better .scatter_ call:

nb_cat = 3  # Number of images to concatenate
nb_classes = 10  # Number of classes (10 for CIFAR-10)

for i, data in enumerate(trainloader, 0):
    inputs, labels = data
    batch_size = inputs.size(0)
    split_size = batch_size // nb_cat
    inputs = torch.cat(inputs.split(split_size), 2)

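    # Split the labels with the same split_size as the images and stack along dim 1,
    # so that row j holds the labels of the images concatenated into inputs[j]
    labels = torch.stack(labels.split(split_size), 1)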
    one_hot_labels = torch.zeros(inputs.size(0), nb_classes, dtype=torch.long).scatter_(1, labels, 1)
    one_hot_labels = one_hot_labels.float()

EDIT: This always assumes the batch size is divisible by nb_cat without a remainder.
I haven’t tested edge cases!
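
One possible way to handle the remainder case is sketched below. The helper concat_batch is hypothetical (not part of PyTorch); it simply drops leftover samples and splits the labels with the same split_size as the images, so that each target row describes the images stacked into the matching output.

```python
import torch

def concat_batch(inputs, labels, nb_cat, nb_classes):
    """Stack nb_cat images vertically and build matching multi-hot targets."""
    split_size = inputs.size(0) // nb_cat
    if split_size == 0:
        raise ValueError("batch is smaller than nb_cat")
    usable = split_size * nb_cat  # drop leftover samples
    inputs, labels = inputs[:usable], labels[:usable]

    # Chunk k supplies the k-th tile of every output image.
    images = torch.cat(inputs.split(split_size), 2)
    # Column k holds the labels of chunk k, so row j matches output image j.
    grouped = torch.stack(labels.split(split_size), 1)
    targets = torch.zeros(split_size, nb_classes).scatter_(1, grouped, 1.0)
    return images, targets

inputs = torch.randn(7, 3, 32, 32)            # 7 is not divisible by 3
labels = torch.tensor([0, 1, 2, 3, 4, 5, 6])  # hypothetical labels
images, targets = concat_batch(inputs, labels, nb_cat=3, nb_classes=10)
print(images.shape)   # torch.Size([2, 3, 96, 32])
print(targets.shape)  # torch.Size([2, 10])
```

With these inputs, output 0 is built from images 0, 2, and 4, and its target row has ones at classes 0, 2, and 4.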

Niki · March 27, 2019, 1:09am

nb_cat = 3  # Number of images to concatenate
nb_classes = 10

for epoch in range(2):
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        batch_size = inputs.size(0)
        split_size = batch_size // nb_cat
        inputs = torch.cat(inputs.split(split_size), 2)
        # Labels are split with split_size so each row matches one concatenated image
        labels = torch.stack(labels.split(split_size), 1)
        one_hot_labels = torch.zeros(inputs.size(0), nb_classes, dtype=torch.long).scatter_(1, labels, 1)
        one_hot_labels = one_hot_labels.float()

This gives me the following error:
RuntimeError: split_size can only be 0 if dimension size is 0, but got dimension size of 2

ptrblck · March 27, 2019, 1:13am

It seems your batch size is smaller than nb_cat, thus the split_size is 0.
This might be the case if you set a smaller batch_size in your DataLoader or alternatively the last batch in the training loop might be smaller, if the number of samples is not divisible by the batch size without a remainder. You could avoid this issue if you set drop_last=True in your DataLoader (or handle the smaller batch in another way).
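
The effect of drop_last can be seen on a small synthetic dataset. The TensorDataset of 20 random samples below is only a stand-in for CIFAR-10.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 20 fake images with random labels; 20 is not divisible by 6.
dataset = TensorDataset(torch.randn(20, 3, 32, 32), torch.randint(0, 10, (20,)))

sizes_keep = [x.size(0) for x, _ in DataLoader(dataset, batch_size=6)]
sizes_drop = [x.size(0) for x, _ in DataLoader(dataset, batch_size=6, drop_last=True)]
print(sizes_keep)  # [6, 6, 6, 2]
print(sizes_drop)  # [6, 6, 6]
```

Without drop_last=True the final batch has only 2 samples, which gives split_size = 0 for nb_cat = 3.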

Niki · March 27, 2019, 1:45am

I added drop_last=True and it’s working now, for batch_size=6, nb_cat=2, nb_classes=10, epoch=3:
print(inputs.shape): torch.Size([3, 3, 64, 32])
print(one_hot_labels.shape): torch.Size([3, 10])

Here the first element of each size (3) is split_size, am I right? Is there any way to make it depend on batch_size instead, like Size([6, 3, 64, 32]) or Size([6, 10])?

Niki · March 27, 2019, 10:00am

When I use these inputs and one_hot_labels in my code, the loss calculation raises this error:

RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'target'

Is this because one_hot_labels is float? How can I solve it?

ptrblck · March 27, 2019, 12:25pm

If you concatenate the images, you’ll get fewer samples, so I’m not sure how you would like to keep the batch size at 6.
Could you explain your use case a bit?

Since you are now dealing with multi-hot encoded targets (i.e. multi-label classification), you could use nn.BCELoss or nn.BCEWithLogitsLoss instead.
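
A minimal sketch of the loss computation with float multi-hot targets. The logits here are random stand-ins for model outputs and the target values are made up. The Long-vs-Float error most likely comes from a criterion that expects class indices as targets, whereas nn.BCEWithLogitsLoss takes float targets with the same shape as the logits.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(3, 10)  # stand-in for raw model outputs (no sigmoid applied)

# Float multi-hot targets: each row marks the classes present in one concatenated image.
targets = torch.zeros(3, 10)
targets[0, 1] = 1.0
targets[0, 4] = 1.0
targets[1, 7] = 1.0
targets[2, 0] = 1.0
targets[2, 9] = 1.0

criterion = nn.BCEWithLogitsLoss()  # applies the sigmoid internally
loss = criterion(logits, targets)
print(loss.item())
```

With nn.BCELoss instead, the model outputs would first need to be passed through a sigmoid.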

Niki · March 27, 2019, 1:57pm

I think you are right regarding batch size.
Thank you very much for your help, by using this new loss it’s working now.

Mohamad_Hadi · November 22, 2020, 8:46am

Hello, how can I convert 16 images of size 128 by 128 into one 512 by 512 image?
That is, 4 across the width and 4 along the height.

ptrblck · November 22, 2020, 8:50am

You could use view or F.fold to create the larger output tensor (depending how the current tensor is shaped one or the other might be easier).
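
Both options could look roughly like the sketch below. It assumes the 16 images are stored as a [16, C, 128, 128] tensor in row-major grid order (image 0 top-left, image 3 top-right), which the question does not state.

```python
import torch
import torch.nn.functional as F

tiles = torch.randn(16, 3, 128, 128)  # 16 images, row-major grid order (assumption)

# view: [grid_row, grid_col, C, H, W] -> permute: [C, grid_row, H, grid_col, W]
grid = tiles.view(4, 4, 3, 128, 128).permute(2, 0, 3, 1, 4).reshape(3, 512, 512)
print(grid.shape)  # torch.Size([3, 512, 512])

# Alternative with F.fold, which expects [N, C * kH * kW, num_blocks].
cols = tiles.reshape(16, -1).t().contiguous().unsqueeze(0)  # [1, 3*128*128, 16]
grid_fold = F.fold(cols, output_size=(512, 512), kernel_size=128, stride=128)[0]
print(grid_fold.shape)  # torch.Size([3, 512, 512])
```

Tile k lands at grid row k // 4 and grid column k % 4, so tile 6 occupies rows 128 to 255 and columns 256 to 383.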

Mohamad_Hadi · November 22, 2020, 9:22am

Thanks for your response.
In this case, if I want to feed 2 images of size 512 by 512 into the network each time, and each image has a different label, should I set batch_size = 32?
I am a beginner, excuse me.

ptrblck · November 22, 2020, 9:49am

If you convert the 16 images into one large image, I would assume that this large image corresponds to a single target. Could you explain your use case a bit, i.e. how the targets are defined and why you would like to reshape the images?