I’m working on an image processing task and want to use torch.cat to concatenate images from two different folders. The images in folder 1 are 224 * 224 * 3, and the images in folder 2 are 224 * 224 * 1. Each folder has 100 images. I want to concatenate them in the axial direction and use the results as input for deep learning.
How can I write this program?
Hi,
I wrote a snippet and it works on my dataset.
f1_folder = './folder1'
f2_folder = './folder2'
# glob returns files in arbitrary order, so sort both lists to keep the pairs aligned
f1_images = sorted(glob.glob(os.path.join(f1_folder, '*.png')))
f2_images = sorted(glob.glob(os.path.join(f2_folder, '*.png')))
for f1_img, f2_img in zip(f1_images, f2_images):
    img1 = Image.open(f1_img)
    img2 = Image.open(f2_img)
    # to_tensor returns 'CHW', so dim=0 concatenates along the channel axis
    cat_img = torch.cat((TF.to_tensor(img1), TF.to_tensor(img2)), dim=0)
Are you using TensorFlow here? I mean “TF.to_tensor(img1)”.
Oh, sorry.
I forgot the imports.
import glob
import os
import torch
import torchvision.transforms.functional as TF
from PIL import Image
Thank you very much, it works now. Could you please tell me how I can use “cat_img” as a dataset for deep learning? I am a beginner in this field.
I think this thread is what you want; it is a custom dataset example.
You can apply some transformations to the image and label, and then concatenate them in __getitem__ if you still need to.
The main problem is that the images in folder 1 are RGB and the images in folder 2 are grayscale.
You can either concatenate them channel-wise, or expand the grayscale images to three channels and stack them along a new dimension.
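To make the two options concrete, here is a minimal sketch. It uses random tensors as stand-ins for the real images (an assumption, so it runs without any files) in the CHW layout that TF.to_tensor produces.

```python
import torch

# Stand-in tensors in CHW layout (random data, not real images).
rgb = torch.rand(3, 224, 224)   # an image from folder 1
gray = torch.rand(1, 224, 224)  # an image from folder 2

# Option 1: concatenate channel-wise into a single 4-channel tensor.
cat_channels = torch.cat((rgb, gray), dim=0)

# Option 2: expand the gray channel to three channels, then stack
# both images along a new leading dimension.
gray_3ch = gray.expand(3, -1, -1)
stacked = torch.stack((rgb, gray_3ch), dim=0)

print(cat_channels.shape)  # torch.Size([4, 224, 224])
print(stacked.shape)       # torch.Size([2, 3, 224, 224])
```

With option 1 the first layer of the network has to accept 4 input channels. With option 2 each sample carries two 3-channel images, so the model has to handle that extra dimension.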
Thank you very much. I tried to write it this way. Is there anything wrong? Can I now use “cat_img” to train my neural network?
from __future__ import print_function, division
import torch
import torchvision.transforms.functional as TF
from torch.utils.data import Dataset
from PIL import Image

image_paths = 'F:\picture1'
target_paths = 'F:\picture2'

class MyDataset(Dataset):
    def __init__(self, image_paths, target_paths, train=True):
        self.image_paths = image_paths
        self.target_paths = target_paths

    def transform(self, image, mask, cat_img):
        image = TF.to_tensor(image)
        mask = TF.to_tensor(mask)
        cat_img = torch.cat((image, mask), dim=1)
        return image, mask, cat_img

    def __getitem__(self, index):
        image = Image.open(self.image_paths[index])
        mask = Image.open(self.target_paths[index])
        x, y = self.transform(image, mask)
        return x, y

    def __len__(self):
        return len(self.image_paths)
The transform method returns three values, so you should unpack all three (for example x, y and cat_xy) in __getitem__.
And if you need the original image, the mask and cat_img, you can return all three values from __getitem__.
Then you can pass your custom dataset to a DataLoader.
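For example, something like the sketch below. RandomPairs is a hypothetical stand-in dataset that returns random tensors with the same shapes as in this thread, just so the snippet runs without image files; in practice you would pass your own dataset instance instead. The batch size is also an arbitrary choice.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class RandomPairs(Dataset):
    """Hypothetical stand-in for the custom dataset; returns random tensors."""

    def __init__(self, n=8):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        image = torch.rand(3, 224, 224)
        mask = torch.rand(1, 224, 224)
        return image, mask, torch.cat((image, mask), dim=0)


# The DataLoader batches samples by stacking them along a new first dimension.
loader = DataLoader(RandomPairs(), batch_size=4, shuffle=True)

image, mask, cat_img = next(iter(loader))
print(cat_img.shape)  # torch.Size([4, 4, 224, 224])
```

In a training loop you would iterate over `loader` and feed each `cat_img` batch to a model whose first layer accepts 4 input channels.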