How to use torch.cat to concat pictures belonging to two different folders

suxiaotang · March 5, 2019, 8:11am

I’m doing an image processing task and I want to use torch.cat to concatenate pictures from two different folders. The images in folder 1 are 224 * 224 * 3, and the images in folder 2 are 224 * 224 * 1. Each folder has 100 images. I want to concatenate them in the axial direction and use the results as input for deep learning.
How can I write this program?

MariosOreo · March 5, 2019, 9:14am

Hi,

I wrote a snippet, and it works on my dataset.

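f1_folder = './folder1'
f2_folder = './folder2'

# Sort both lists so files are paired in a deterministic order (glob does not guarantee one).
f1_images = sorted(glob.glob(os.path.join(f1_folder, '*.png')))
f2_images = sorted(glob.glob(os.path.join(f2_folder, '*.png')))

for f1_img, f2_img in zip(f1_images, f2_images):
    img1 = Image.open(f1_img)
    img2 = Image.open(f2_img)

    # to_tensor returns 'CHW' tensors, so dim=0 concatenates along the channel axis.
    cat_img = torch.cat((TF.to_tensor(img1), TF.to_tensor(img2)), dim=0)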

suxiaotang · March 5, 2019, 9:39am

Is this using TensorFlow? I mean “TF.to_tensor(img1)”.

MariosOreo · March 5, 2019, 9:40am

Oh, sorry.
I forgot the imports. TF here is torchvision.transforms.functional, not TensorFlow.

import glob
import os
import torch
import torchvision.transforms.functional as TF
from PIL import Image

suxiaotang · March 6, 2019, 7:30am

Thank you very much, it works now. Could you please tell me how I can use “cat_img” as a dataset for deep learning? I am a beginner in this field.

MariosOreo · March 6, 2019, 9:16am

I think this thread is what you want; it is a custom dataset example.

You can apply some transformations to the image and label, and then concatenate them in __getitem__ if you still need to.

JuanFMontesinos · March 6, 2019, 10:10am

The main problem is that the images in folder 1 are RGB and the images in folder 2 are grayscale.

You can either concatenate them channel-wise, or expand the grayscale images to three channels and concatenate them along a new dimension.
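
A minimal sketch of both options, assuming random tensors as stand-ins for the two images (the variable names are illustrative):

```python
import torch

# Stand-ins for one RGB image and one grayscale image, both in CHW layout.
rgb = torch.rand(3, 224, 224)
gray = torch.rand(1, 224, 224)

# Option 1: channel-wise concatenation gives a single 4-channel image.
channel_wise = torch.cat((rgb, gray), dim=0)   # shape (4, 224, 224)

# Option 2: repeat the gray channel three times, then stack along a new
# leading dimension, giving two 3-channel images.
gray_rgb = gray.expand(3, -1, -1)              # shape (3, 224, 224), a view, no copy
stacked = torch.stack((rgb, gray_rgb), dim=0)  # shape (2, 3, 224, 224)

print(channel_wise.shape, stacked.shape)
```

Which layout is appropriate depends on the network: the first feeds one 4-channel input, while the second keeps the two images separate along an extra axis.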

suxiaotang · March 6, 2019, 12:13pm

Thank you very much. I tried to write it this way. Is there anything wrong? Can I now use “cat_img” to train my neural network?

from __future__ import print_function, division
import torch
import torchvision.transforms.functional as TF
from torch.utils.data import Dataset
from PIL import Image

image_paths = r'F:\picture1'
target_paths = r'F:\picture2'

class MyDataset(Dataset):
    def __init__(self, image_paths, target_paths, train=True):
        self.image_paths = image_paths
        self.target_paths = target_paths

    def transform(self, image, mask, cat_img):
        image = TF.to_tensor(image)
        mask = TF.to_tensor(mask)
        cat_img = torch.cat((image, mask), dim=1)
        return image, mask, cat_img

    def __getitem__(self, index):
        image = Image.open(self.image_paths[index])
        mask = Image.open(self.target_paths[index])
        x, y = self.transform(image, mask)
        return x, y

    def __len__(self):
        return len(self.image_paths)

MariosOreo · March 6, 2019, 12:19pm

The transform method returns three values, so you should unpack all three (x, y, and cat_xy) in __getitem__.
And if you need the original image, the mask, and cat_img, you can return all three values from __getitem__.

Then you can pass your custom dataset to a DataLoader.