I want to write code in PyTorch that concatenates two images (32 by 32) so that the output image is 64 by 32. How should I do that?
Thank you~
torch.cat should do the job:
a = torch.randn(3, 32, 32)
b = torch.randn(3, 32, 32)
c = torch.cat((a, b), 1)
print(c.shape)
> torch.Size([3, 64, 32])
Thank you for your reply. I saw torch.cat but don't know how I can apply it here:
import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=5,
                                          shuffle=True, num_workers=1)
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

for epoch in range(3):
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        # Final_input = image 64 by 32
You are currently using a batch size of 5, which won't work if you would like to concatenate two images.
If you use an even batch size, you could concatenate the images using this code:
inputs = torch.cat((inputs[::2], inputs[1::2]), 2)
Since you are using shuffle=True, I assume the pairs used to create the larger tensors do not matter. Is this correct, or would you like to concatenate specific pairs of image tensors?
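To make the pairing visible, here is a minimal sketch on a dummy batch (the tensor below is a made-up stand-in for CIFAR-10 data, where every pixel of image k holds the value k). It assumes an even batch size and shows which two images end up in each concatenated sample:

```python
import torch

# Dummy batch standing in for CIFAR-10 images: 6 images, 3 channels, 32x32.
# Every pixel of image k holds the value k, so the source image is easy to spot.
inputs = torch.arange(6, dtype=torch.float32).view(6, 1, 1, 1).expand(6, 3, 32, 32)

# Even-indexed images go on top, odd-indexed images below (dim 2 is the height).
paired = torch.cat((inputs[::2], inputs[1::2]), 2)

print(paired.shape)  # torch.Size([3, 3, 64, 32])
# Sample 1 is built from image 2 (top half) and image 3 (bottom half).
print(paired[1, 0, 0, 0].item(), paired[1, 0, 32, 0].item())  # 2.0 3.0
```

Using dim 3 instead of dim 2 would place the two images side by side (32 by 64) rather than on top of each other.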
Thank you very much! It's working. No, I don't need specific pairs.
If I want to concatenate the labels at the same time, is this correct?
labels = torch.cat((labels, labels))
Not really, as your samples now contain more than a single class and you are now dealing with a multi-label classification use case.
In this setup, you could create multi-hot targets and change the criterion to e.g. nn.BCEWithLogitsLoss:
one_hot_labels = torch.nn.functional.one_hot(labels)
one_hot_labels = one_hot_labels[::2] | one_hot_labels[1::2]
one_hot_labels = one_hot_labels.float()
I tried that, but it gives me this error:
AttributeError: module 'torch.nn.functional' has no attribute 'one_hot'
The torch version is 1.0.1.post2.
I'm using 1.0.0.dev20190312. You could use the nightly build or, if you prefer to stay on 1.0.1.post2, you could use this code to create the one-hot encoded targets:
one_hot_labels = torch.zeros(labels.size(0), nb_classes, dtype=torch.long).scatter_(1, labels.unsqueeze(1), 1)
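As a small end-to-end sketch of the label side, the scatter_-based encoding can be combined with the pairwise merge. The label values and nb_classes = 10 below are illustrative assumptions (CIFAR-10 has 10 classes), and the (even, odd) pairing mirrors the inputs[::2] / inputs[1::2] slicing used for the images:

```python
import torch

nb_classes = 10  # assumed: CIFAR-10 has 10 classes
labels = torch.tensor([3, 5, 1, 1])  # hypothetical labels for a batch of 4

# One-hot encode without F.one_hot, so this also runs on older releases.
one_hot = torch.zeros(labels.size(0), nb_classes, dtype=torch.long)
one_hot.scatter_(1, labels.unsqueeze(1), 1)

# Merge the labels of each (even, odd) pair with a bitwise OR.
multi_hot = (one_hot[::2] | one_hot[1::2]).float()

print(multi_hot.shape)  # torch.Size([2, 10])
print(multi_hot)
```

Note that a pair with two identical labels (the second pair here) yields a row with a single 1, so the targets do not encode how often a class appears.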
Thank you! That works, but I noticed that one_hot_labels at the end of the loop has size [2, 10]. Is it possible for it to have size [2, 10] inside the loop as well?
If I want to concatenate more than two images, which parts of inputs and one_hot_labels should be changed?
A more general approach could use torch.split and a better .scatter_ call:
nb_cat = 3  # Number of images to concatenate
for i, data in enumerate(trainloader, 0):
    inputs, labels = data
    batch_size = inputs.size(0)
    split_size = batch_size // nb_cat
    inputs = torch.cat(inputs.split(split_size), 2)
    # Split the labels with the same split_size so each row matches its concatenated image
    labels = torch.stack(labels.split(split_size), 1)
    one_hot_labels = torch.zeros(inputs.size(0), nb_classes, dtype=torch.long).scatter_(1, labels, 1)
    one_hot_labels = one_hot_labels.float()
EDIT: This always assumes the batch size is divisible by nb_cat without a remainder.
I haven't tested edge cases!
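The general approach could be wrapped in a helper along these lines. This is only a sketch: the function name concat_batch and the explicit divisibility check are my own additions. It groups the labels with the same split_size as the images, so that each target row lines up with the images stacked into that sample:

```python
import torch

def concat_batch(inputs, labels, nb_cat, nb_classes):
    """Stack nb_cat images vertically and build matching multi-hot targets."""
    batch_size = inputs.size(0)
    if batch_size % nb_cat != 0:
        raise ValueError("batch size must be divisible by nb_cat")
    split_size = batch_size // nb_cat
    # Output sample i is built from images i, i + split_size, i + 2 * split_size, ...
    merged = torch.cat(inputs.split(split_size), 2)
    # Group the labels the same way: shape [split_size, nb_cat].
    grouped = torch.stack(labels.split(split_size), dim=1)
    targets = torch.zeros(split_size, nb_classes).scatter_(1, grouped, 1.0)
    return merged, targets

# Dummy data: 6 images with labels 0..5, concatenated in groups of 3.
inputs = torch.randn(6, 3, 32, 32)
labels = torch.tensor([0, 1, 2, 3, 4, 5])
merged, targets = concat_batch(inputs, labels, nb_cat=3, nb_classes=10)

print(merged.shape)   # torch.Size([2, 3, 96, 32])
print(targets.shape)  # torch.Size([2, 10])
```

In this example the first output sample is made of images 0, 2 and 4, and its target row has ones at classes 0, 2 and 4.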
nb_cat = 3  # Number of images to concatenate
nb_classes = 10
for epoch in range(2):
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        batch_size = inputs.size(0)
        split_size = batch_size // nb_cat
        inputs = torch.cat(inputs.split(split_size), 2)
        labels = torch.stack(labels.split(nb_cat))
        one_hot_labels = torch.zeros(inputs.size(0), nb_classes, dtype=torch.long).scatter_(1, labels, 1)
        one_hot_labels = one_hot_labels.float()
This gives me the following error:
RuntimeError: split_size can only be 0 if dimension size is 0, but got dimension size of 2
It seems your batch size is smaller than nb_cat, thus the split_size is 0.
This might be the case if you set a smaller batch_size in your DataLoader, or alternatively the last batch in the training loop might be smaller if the number of samples is not divisible by the batch size without a remainder. You could avoid this issue by setting drop_last=True in your DataLoader (or handle the smaller batch in another way).
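The effect of drop_last can be checked on a small stand-in dataset. The TensorDataset of 20 random samples below is purely illustrative (not CIFAR-10), chosen so that the sample count is not divisible by the batch size:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in dataset: 20 random "images" with labels in [0, 10).
dataset = TensorDataset(torch.randn(20, 3, 32, 32), torch.randint(0, 10, (20,)))

# 20 is not divisible by 6, so the last batch holds only 2 samples.
loader = DataLoader(dataset, batch_size=6, shuffle=True)
sizes_default = [x.size(0) for x, _ in loader]
print(sizes_default)  # [6, 6, 6, 2]

# drop_last=True discards that incomplete batch.
loader = DataLoader(dataset, batch_size=6, shuffle=True, drop_last=True)
sizes_dropped = [x.size(0) for x, _ in loader]
print(sizes_dropped)  # [6, 6, 6]
```

With nb_cat = 3, the trailing batch of 2 samples would give split_size = 2 // 3 = 0, which is what triggers the split_size error.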
I added drop_last=True and it's working now. For batch_size=6, nb_cat=2, nb_classes=10, epoch=3:
print(inputs.shape): torch.Size([3, 3, 64, 32])
print(one_hot_labels.shape): torch.Size([3, 10])
Here the first element of the size (3) is split_size, am I right? Is there any way to make it depend on batch_size instead, like Size([6, 3, 64, 32]) or Size([6, 10])?
When I use these inputs and one_hot_labels in my code, the loss calculation gives me this error:
RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'target'
Is this because of the float in one_hot_labels? How can I solve it?
If you concatenate the images, you'll get fewer samples, so I'm not sure how you would like to keep the batch size as 6.
Could you explain your use case a bit?
Since you are now dealing with multi-hot encoded targets (i.e. multi-label classification), you could use nn.BCELoss or nn.BCEWithLogitsLoss instead.
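A minimal sketch of the loss computation with multi-hot targets follows. The logits are random placeholders for a model output and the target positions are made up. The Long/Float error above presumably came from a criterion such as nn.CrossEntropyLoss, which expects integer class indices as targets, whereas nn.BCEWithLogitsLoss expects float targets with the same shape as the logits:

```python
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

# Hypothetical model output: raw logits for 3 samples and 10 classes.
logits = torch.randn(3, 10, requires_grad=True)

# Multi-hot targets must be float and have the same shape as the logits.
targets = torch.zeros(3, 10)
targets[0, [3, 5]] = 1.0
targets[1, [1, 7]] = 1.0
targets[2, 4] = 1.0

loss = criterion(logits, targets)
loss.backward()
print(loss.item())
```

nn.BCEWithLogitsLoss applies the sigmoid internally, so the model should output raw logits. With nn.BCELoss the model output would need to pass through a sigmoid first.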
I think you are right regarding batch size.
Thank you very much for your help. With this new loss it's working now.
Hello, how can I convert 16 images of size 128 by 128 into one 512 by 512 image?
That is, 4 images across the width and 4 along the height.
You could use view or F.fold to create the larger output tensor (depending on how the current tensor is shaped, one or the other might be easier).
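One possible way to do this with view is sketched below. It assumes the 16 images are stored as a [16, C, 128, 128] tensor and should be laid out row by row in a 4 by 4 grid. The permute step is my addition: it is needed so that the grid rows end up next to the height dimension and the grid columns next to the width dimension before flattening:

```python
import torch

imgs = torch.randn(16, 3, 128, 128)  # assumed layout: [N, C, H, W]
c, h, w = imgs.shape[1:]

# [16, C, H, W] -> [grid_row, grid_col, C, H, W]
grid = imgs.view(4, 4, c, h, w)
# -> [C, grid_row, H, grid_col, W]
grid = grid.permute(2, 0, 3, 1, 4)
# Merge (grid_row, H) and (grid_col, W) into the final height and width.
big = grid.reshape(c, 4 * h, 4 * w)

print(big.shape)  # torch.Size([3, 512, 512])
# The tile in grid row 1, column 2 is the original image with index 1 * 4 + 2 = 6.
print(torch.equal(big[:, 128:256, 256:384], imgs[6]))  # True
```

reshape is used for the last step because the permuted tensor is no longer contiguous, so a plain view would raise an error there.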
Thanks for your response.
In this case, if I want to feed 2 images of size 512 by 512 into the network each time, and each image has a different label, should I set batch_size = 32?
I am a beginner, so please excuse me.
If you convert the 16 images into one large image, I would assume that this large image corresponds to a single target. Could you explain your use case a bit, i.e. how the targets are defined and why you would like to reshape the images?