I am creating my own dataset class which takes a path to a csv file containing image paths and labels:
class MyDataset():
    def __init__(self, csv_path):
        ...

    def __len__(self):
        return len(...)

    def __getitem__(self, index):
        ...
        return (img, label)
As you can see, it contains the methods __init__, __getitem__ and __len__.
Now, this is the source code for PyTorch’s Dataset:
class Dataset(object):
    def __getitem__(self, index):
        raise NotImplementedError

    def __add__(self, other):
        return ConcatDataset([self, other])
and the docs say:
All datasets that represent a map from keys to data samples should subclass it
But I don’t see anything special except the __add__ method, which I don’t think I need in my case (otherwise I could write my own). Is it still necessary to inherit from Dataset after implementing my own __getitem__ and __len__, in order to create a DataLoader later on? What is the advantage of subclassing it?
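To make the question concrete, here is a small pure-Python sketch of the only difference I can see from the quoted source. The ConcatDataset below is a simplified stand-in I wrote for illustration, not the real PyTorch class, and PlainPairs and SubclassedPairs are hypothetical names. It does not touch DataLoader at all, so it says nothing about whether the loader itself requires the base class.

```python
# Pure-Python stand-ins mirroring the quoted source; these are NOT the torch classes.
class ConcatDataset:
    """Hypothetical minimal stand-in: chains several datasets end to end."""

    def __init__(self, datasets):
        self.datasets = list(datasets)

    def __len__(self):
        return sum(len(d) for d in self.datasets)

    def __getitem__(self, index):
        # Walk through the datasets until the index falls inside one of them.
        for d in self.datasets:
            if index < len(d):
                return d[index]
            index -= len(d)
        raise IndexError("index out of range")


class Dataset(object):
    """Same two methods as in the source quoted above."""

    def __getitem__(self, index):
        raise NotImplementedError

    def __add__(self, other):
        return ConcatDataset([self, other])


class PlainPairs:
    """Duck-typed dataset: no base class, only __len__ and __getitem__."""

    def __init__(self, pairs):
        self.pairs = pairs

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, index):
        return self.pairs[index]


class SubclassedPairs(Dataset):
    """Same behaviour, but inherits __add__ from Dataset."""

    def __init__(self, pairs):
        self.pairs = pairs

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, index):
        return self.pairs[index]


plain = PlainPairs([("a.png", 0), ("b.png", 1)])
sub = SubclassedPairs([("c.png", 1)])

# Works: the left operand inherits __add__ from Dataset.
combined = sub + plain

# Fails: PlainPairs defines no __add__ and Dataset defines no __radd__.
try:
    plain + sub
    plain_supports_add = True
except TypeError:
    plain_supports_add = False
```

In this sketch both classes index and report their length identically; the subclass only gains the `+` operator, which is the part I am unsure I need.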