What should I output when defining an image-to-value dataset?

This is the first time I am defining my own dataset.
I want to define a dataset in which each entry is one image associated with a value.
When I define the dataset, should self.samples be an array of:
1, one string containing the image path on disk, and one floating-point value, or
2, the actual image RGB value matrix and one floating-point value?
When I create the dataset, which of the following calls should I use?

train_data=torchvision.datasets.MYDATA('../mydata', train=True, download=True,
                       transform=torchvision.transforms.Resize((h_image, w_image)))

or

train_data=MYDATA('../mydata', train=True, download=True,
                       transform=torchvision.transforms.Resize((h_image, w_image)))

What conditions do I need to meet to use the transforms utilities, such as rotate and flip?
I hope I described my question clearly.

I would recommend sticking to the first approach: store only the paths and lazily load the samples in the __getitem__ method, as this saves memory.
You would then create the dataset by calling its constructor (its __init__ method), so the second call looks more correct, although you would have to make sure all arguments are really expected (e.g. the download argument might not be needed).

So, are you suggesting the following?
Define Data:

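import numpy
import PIL.Image
from torch.utils.data import Dataset

class MYDATA(Dataset):
    # parm1..parm4 get None defaults here so they can follow root (passed by keyword)
    def __init__(self, root='folder_path', parm1=None, parm2=None, parm3=None, parm4=None,
                 train=True, data_augmentation=False):
        super(MYDATA, self).__init__()
        # if parm1: ... if parm2: ... if parm3: ... if parm4: ... if train: ...
        # after a bunch of no_matter_what operations
        self.samples = []  # list of (path, value) tuples: [(path, value), ...]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        # the image is only read from disk when this entry is requested
        image = PIL.Image.open(self.samples[idx][0])
        imagearray = numpy.asarray(image)
        return (imagearray, self.samples[idx][1])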

Get Data:

train_data=MYDATA('../mydata', train=True, parm1=parm1,
                       transform=torchvision.transforms.Resize((h_image, w_image)))

What I understand now is:
1, I can prepare my data in __init__() according to some conditions, for example: parm1, parm2, …

What I don't understand is:
1, When is __getitem__() called? It is never called explicitly in the Get Data code.
2, I never define any transforms behavior in the MYDATA class. Can I still use the torchvision.transforms tools?
3, If I want to do data augmentation, for example add some rotated and flipped copies of the same image and associate them with the same value, should I do it explicitly in __init__()?
4, Can torchvision.transforms handle the data augmentation for me, or can it only change (rotate, flip) and replace the original data entry instead of adding new entries?
5, If I define everything in the MYDATA class __init__(), including data augmentation, and convert the image path to an array in __getitem__(), do I still need those torchvision.transforms calls? What exactly does torchvision.transforms.ToTensor() do?

I think those are all the questions I can think of right now. (It is a lot, ^_^)

Thank you very much!

  1. The __getitem__ function is called when the Dataset is indexed, either directly:
x, y = dataset[index]

or from the DataLoader by e.g. iterating it:

for data, target in loader:

Internally, the DataLoader uses its sampler to create indices and indexes the wrapped Dataset with them.

  2. Yes, you should pass the transformations as an object to the Dataset.__init__ method and use it in __getitem__. Usually something like this is used:
def __init__(self, transform=None):
    self.data = ...
    self.target = ...
    self.transform = transform

def __getitem__(self, index):
    x = self.data[index]
    y = self.target[index]
    if self.transform:
        x = self.transform(x)
    return x, y
  3. This would be a valid approach, but usually data augmentation is done on the fly and a single sample is returned in the __getitem__ method.

  4. Yes, torchvision.transforms also provides transformations for rotation etc.

  5. Yes, you need to call the transformations in __getitem__.

Thanks! :grinning:

If I want to use data.to(device) later to move the data to the GPU, do I need to write anything in the MYDATA class? Thanks.

No, the common approach is to move the data to the GPU in the DataLoader loop.
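
A minimal sketch of that loop, assuming a stand-in TensorDataset of random tensors in place of MYDATA (the shapes and batch size are made up for illustration). The dataset itself keeps returning CPU tensors and each batch is moved inside the loop:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Fall back to the CPU when no GPU is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in data: 8 fake RGB images of size 32x32 and one float value each.
dataset = TensorDataset(torch.rand(8, 3, 32, 32), torch.rand(8))
loader = DataLoader(dataset, batch_size=4)

for data, target in loader:
    # The dataset returns CPU tensors; each batch is moved to the device here.
    data = data.to(device)
    target = target.to(device)
    # ... forward pass, loss, and optimizer step would follow
```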

