Currently, I have a CSV file containing the file path of every image in my dataset, along with the class name and the position of the object in each image. Given this, I have a question.
When I load my dataset, do I need to crop the images to each bounding box before I start the classification?
I'm a little lost with the process from creating the dataset to the image classification.
Let me explain a little how datasets and dataloaders work here.
On one hand, PyTorch provides the Dataset class. Subclassing it requires at least three methods to be defined: __init__, __len__ and __getitem__.
Typically, in your __init__ you build a list (or lists) of the files that will be loaded later, without doing any processing on them yet.
In __len__ you return how many elements your dataset has, typically the length of the aforementioned list.
Lastly, you define __getitem__. Here you code, as efficiently as you can, how to load a single sample of your dataset. PyTorch is totally agnostic to what you consider a sample; it will just run whatever you have coded.
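Putting those three methods together, a minimal skeleton could look like this. I'm assuming your CSV has columns named path, label, x1, y1, x2, y2 (hypothetical names, adapt them to your actual header):

```python
import csv

from torch.utils.data import Dataset


class BoundingBoxDataset(Dataset):
    def __init__(self, csv_path):
        # Only build the index of samples here; no image is loaded yet.
        with open(csv_path, newline="") as f:
            self.rows = list(csv.DictReader(f))

    def __len__(self):
        # One sample per CSV row.
        return len(self.rows)

    def __getitem__(self, idx):
        # All the heavy lifting (reading the image, cropping, transforms)
        # belongs here; for now we just return the raw annotation.
        row = self.rows[idx]
        box = tuple(int(row[k]) for k in ("x1", "y1", "x2", "y2"))
        return row["path"], row["label"], box
```

Note that __init__ only reads the lightweight CSV; the expensive work is deferred to __getitem__, which is called once per sample.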
On the other hand, we have the DataLoader. This class stacks samples to create batches, and takes care of shuffling and multiprocessing. It is agnostic too: the DataLoader just generates a list of indices, which is ordered by default and contains as many elements as you returned in
__len__. When you shuffle, you are only shuffling those indices.
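You can see the index-based behaviour with a toy dataset (the class and sizes below are just for illustration):

```python
from torch.utils.data import DataLoader, Dataset


class SquaresDataset(Dataset):
    # Toy dataset: sample i is simply i squared.
    def __len__(self):
        return 8

    def __getitem__(self, idx):
        return idx * idx


# The DataLoader draws the indices 0..7 (shuffled here), calls
# __getitem__ for each one, and stacks every 4 results into a batch.
loader = DataLoader(SquaresDataset(), batch_size=4, shuffle=True)
for batch in loader:
    print(batch)  # a tensor of 4 stacked samples, in shuffled order
```

Whatever order the batches come out in, every one of the 8 samples appears exactly once per epoch; shuffling never drops or duplicates indices.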
So in short, you have to define a class that inherits from Dataset. In
__init__ you provide a structure describing the files to be loaded. In
__getitem__ you load those files and put all the workload, such as reading the image, data augmentation, normalization or any other processing, since the
__getitem__ function is the one which is submitted to multiprocessing.
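Regarding your original question: one common approach is to not pre-crop anything on disk and instead crop to the bounding box inside __getitem__, alongside the rest of the workload. A sketch, again assuming the hypothetical CSV columns path, label, x1, y1, x2, y2 and using PIL for image loading:

```python
import csv

from PIL import Image
from torch.utils.data import Dataset


class CroppedClassificationDataset(Dataset):
    # Assumes CSV columns path,label,x1,y1,x2,y2; adapt to your file.
    def __init__(self, csv_path, transform=None):
        with open(csv_path, newline="") as f:
            self.rows = list(csv.DictReader(f))
        self.transform = transform

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        row = self.rows[idx]
        image = Image.open(row["path"]).convert("RGB")
        # Crop the object out of the full image using its bounding box,
        # so no pre-cropped copies need to exist on disk.
        box = tuple(int(row[k]) for k in ("x1", "y1", "x2", "y2"))
        crop = image.crop(box)  # PIL expects (left, upper, right, lower)
        if self.transform is not None:
            # e.g. resize + ToTensor + normalization for your classifier
            crop = self.transform(crop)
        return crop, row["label"]
```

If the same full image is read many times (several boxes per image), pre-cropping once offline can still be worth it as a speed optimization, but it is not required for the pipeline to work.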