Custom Dataset labeling from CSV

Hi,

You can easily define your own custom dataset class to handle this kind of situation.
Here is the top-level structure of the class you can implement:


from torch.utils.data import Dataset

class PlacesDataset(Dataset):
    def __init__(self, csv_file, image_dir, transform=None):
        # store the paths to the csv file and the image folder, plus any transforms
        raise NotImplementedError

    def __len__(self):
        # return a single integer, the length of your dataset: in your case, the
        # number of images in your train folder or of lines in the csv file
        raise NotImplementedError

    def __getitem__(self, idx):
        # this is the most important part: read one image from the folder and its
        # label from the csv file, and return a single (image, label) pair for index
        # idx. Handle exactly 1 sample here, as if your whole dataset held only that
        # image; DataLoader builds the batches by calling this method repeatedly
        # (in parallel worker processes when num_workers > 0).
        raise NotImplementedError
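To make the skeleton above a bit more concrete, here is a minimal sketch of how it could be filled in. It is not the code from my repository, just an illustration under some assumptions: the csv file has no header row and each line looks like `filename,label` with an integer label, the images are opened with Pillow, and the argument names `csv_file`, `image_dir` and `transform` are only placeholders I picked for the example.

```python
import csv
import os

from PIL import Image
from torch.utils.data import Dataset


class PlacesDataset(Dataset):
    """Pairs image files with labels listed in a csv file.

    Assumed csv layout (hypothetical): no header row, one sample per line,
    formatted as `filename,label` where label is an integer.
    """

    def __init__(self, csv_file, image_dir, transform=None):
        self.image_dir = image_dir
        self.transform = transform
        with open(csv_file, newline="") as f:
            # Keep only (filename, label) pairs in memory; the images
            # themselves are read lazily in __getitem__.
            self.samples = [(row[0], int(row[1])) for row in csv.reader(f) if row]

    def __len__(self):
        # One sample per non-empty csv line.
        return len(self.samples)

    def __getitem__(self, idx):
        # Load exactly one sample: the image and label at position idx.
        filename, label = self.samples[idx]
        image = Image.open(os.path.join(self.image_dir, filename)).convert("RGB")
        if self.transform is not None:
            image = self.transform(image)
        return image, label
```

If you want to pass this to DataLoader, give it a transform that turns the PIL image into a tensor (for example torchvision.transforms.ToTensor()), because the default collate function cannot batch PIL images.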

Now you can do with this class whatever you wanted to do with ImageFolder.
I know the explanation is quite abstract, but this is the whole idea. If you need real code that works, the link below is mine; it uses a csv file to read images and generate labels on the fly.

If you have any questions, feel free to ask.

Bests
Nik
