How to combine datasets after applying PyTorch transforms to individual datasets

hussain · November 16, 2020, 8:08am

Hi seniors,

I am using a medical dataset of 2232 .png images at 256x256 resolution to train my PyTorch-based CNN model from scratch. I applied a vertical flip and a horizontal flip transform to the dataset separately. Now the original, the vertically flipped, and the horizontally flipped datasets each contain 2232 images.

I want to combine all three datasets into one (or, say, put them all in one tensor) to get 3*2232 = 6696 images. A code snippet is given below:

from torchvision import transforms

# Original images: convert to tensor and normalize only.
orig_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.1565,), std=(0.1668,)),
])

# p=1 means the horizontal flip is always applied.
HorizFlip = transforms.Compose([
    transforms.RandomHorizontalFlip(p=1),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.1565,), std=(0.1668,)),
])

# p=1 means the vertical flip is always applied.
VertiFlip = transforms.Compose([
    transforms.RandomVerticalFlip(p=1),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.1565,), std=(0.1668,)),
])

orig_dataset = MRI_Dataset(csv_file = 'C:/Users/Block-03-EE/AnacondaFiles/RecordWise/tr_file.csv',
                    root_dir = 'C:/Users/Block-03-EE/AnacondaFiles/RecordWise/training_data',
                    # root_dir = './tumor_data/MinMax/original_image',
                    transform =  orig_transform
                    )

# Vertically flipped copy of the dataset.
verti_dataset = MRI_Dataset(csv_file = 'C:/Users/Block-03-EE/AnacondaFiles/RecordWise/tr_file.csv',
                    root_dir = 'C:/Users/Block-03-EE/AnacondaFiles/RecordWise/training_data',
                    # root_dir = './tumor_data/MinMax/original_image',
                    transform =  VertiFlip
                    )

# Horizontally flipped copy of the dataset.
horiz_dataset = MRI_Dataset(csv_file = 'path/traning_file.csv',
                    root_dir = 'C:/Users/Block-03-EE/AnacondaFiles/RecordWise/training_data',
                    # root_dir = './tumor_data/MinMax/original_image',
                    transform =  HorizFlip
                    )
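
Because both flips use p=1, every image is always flipped, so each flipped dataset is a deterministic mirror of the original. A small sketch of what that means on a (channels, height, width) tensor, using plain torch.flip as a stand-in for the torchvision transforms (the tiny 2x3 image is made up for illustration):

```python
import torch

# A tiny 1x2x3 "image" (channels, height, width) so the effect is visible.
image = torch.arange(6.0).reshape(1, 2, 3)

horiz = torch.flip(image, dims=[-1])  # reverse the width axis (left-right mirror)
verti = torch.flip(image, dims=[-2])  # reverse the height axis (top-bottom mirror)

print(image[0])  # rows: [0, 1, 2] and [3, 4, 5]
print(horiz[0])  # rows: [2, 1, 0] and [5, 4, 3]
print(verti[0])  # rows: [3, 4, 5] and [0, 1, 2]
```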

MRI_Dataset is a class that picks each individual image and its corresponding label, with the help of a .csv file, from a local directory. This code is taken from Aladdin Persson's YouTube video.

I need the following magic operation, or any idea that would work in my case.

dataset = <magic operation>(orig_dataset, horiz_dataset,verti_dataset)

Please help me in this regard.

Thank you!

PS: I am new to PyTorch.

Preetham_R_Patlolla · November 16, 2020, 8:55am

One way is to use torch.cat. It lets you concatenate tensors along the axis of your choice.
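
Note that torch.cat operates on tensors, so it only applies once the images are already held in tensors rather than in a Dataset object. A minimal sketch, assuming three batches of single-channel images (the 4x1x8x8 sizes and the variable names are made up to keep the example small):

```python
import torch

# Stand-in batches: 4 single-channel 8x8 "images" each.
orig = torch.zeros(4, 1, 8, 8)
horiz = torch.ones(4, 1, 8, 8)
verti = torch.full((4, 1, 8, 8), 2.0)

# Concatenate along dim 0, the sample axis, so the batches are stacked end to end.
all_images = torch.cat((orig, horiz, verti), dim=0)
print(all_images.shape)  # torch.Size([12, 1, 8, 8])
```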

hussain · November 16, 2020, 10:16am

Thank you for the reply.

I tried that, but it failed because I load the data through the dataset class taken from the YouTube video mentioned above. When I run torch.cat as you suggested, the following error comes up:

TypeError                                 Traceback (most recent call last)
<ipython-input-155-0d3078352cfb> in <module>
----> 1 data = torch.cat((orig_dataset,verti_dataset,horiz_dataset),0)

TypeError: expected Tensor as element 0 in argument 0, but got MRI_Dataset

The dataset is of type __main__.MRI_Dataset.

hussain · November 16, 2020, 10:17am

FOUND SOLUTION!

I ran the following command and it worked!

data = torch.utils.data.ConcatDataset((orig_dataset,verti_dataset,horiz_dataset))

Thank you for coming to this post!
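
For readers who want to try this without the MRI data, here is a minimal sketch of the same idea. ToyDataset is a hypothetical stand-in for MRI_Dataset (random 8x8 tensors instead of .png files), and torch.flip stands in for the torchvision flip transforms; only ConcatDataset and DataLoader are the real pieces from the thread.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical stand-in for MRI_Dataset: returns (image, label) pairs."""

    def __init__(self, images, labels, transform=None):
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        image = self.images[index]
        if self.transform is not None:
            image = self.transform(image)  # applied lazily, per sample
        return image, self.labels[index]


# Five fake single-channel images shared by all three datasets.
images = torch.rand(5, 1, 8, 8)
labels = torch.randint(0, 2, (5,))

orig_ds = ToyDataset(images, labels)
horiz_ds = ToyDataset(images, labels, transform=lambda x: torch.flip(x, dims=[-1]))
verti_ds = ToyDataset(images, labels, transform=lambda x: torch.flip(x, dims=[-2]))

# ConcatDataset chains the datasets: indices 0-4, 5-9, 10-14.
combined = ConcatDataset([orig_ds, horiz_ds, verti_ds])
print(len(combined))  # 15

# The combined dataset can be passed to a DataLoader like any other dataset.
loader = DataLoader(combined, batch_size=4, shuffle=True)
batch_images, batch_labels = next(iter(loader))
print(batch_images.shape)  # torch.Size([4, 1, 8, 8])
```

ConcatDataset does not copy or precompute anything; it only maps a global index to the right underlying dataset, so each transform still runs when a sample is fetched.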