I am quite new to PyTorch, so please bear with me. I have a CSV dataset which looks like so:
where class_label is my target and image_location is the input to my NN.
I would like to use a DataLoader to split this into train and test sets, with stratified sampling of each class in the CSV (20 classes, 4 images per class).
Googling around, I found some pandas and scikit-learn solutions, starting with something like:
sss = StratifiedShuffleSplit(df['event'], n_iter=1, test_size=0.2)
which I presume gives a single train/test split that I can wrap in a Dataset and later batch with DataLoader. However, I am not sure this is an elegant solution, and I was wondering if someone could point me in the right direction to achieve this.
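
To make the question concrete, here is a minimal sketch of the kind of split I have in mind. It is only an illustration under some assumptions: stratified_split and make_loaders are hypothetical helper names, dataset is assumed to be a torch.utils.data.Dataset whose i-th item corresponds to the i-th CSV row, and the labels are toy integers standing in for class_label. With only 4 images per class, a 20% test split cannot be hit exactly per class, so the sketch holds out 1 image of every 4 (25%) instead.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx), holding out test_fraction of each class."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Aim for at least one test sample per class, but never the whole class.
        n_test = max(1, round(len(indices) * test_fraction))
        n_test = min(n_test, len(indices) - 1)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx


def make_loaders(dataset, labels, batch_size=8):
    """Hypothetical helper: wrap the two index lists in Subsets and DataLoaders."""
    # Imported lazily so the split helper above can be tried without PyTorch.
    from torch.utils.data import DataLoader, Subset

    train_idx, test_idx = stratified_split(labels)
    train_loader = DataLoader(Subset(dataset, train_idx),
                              batch_size=batch_size, shuffle=True)
    test_loader = DataLoader(Subset(dataset, test_idx),
                             batch_size=batch_size, shuffle=False)
    return train_loader, test_loader


# Toy labels mirroring the CSV shape: 20 classes, 4 images per class.
labels = [c for c in range(20) for _ in range(4)]
train_idx, test_idx = stratified_split(labels)
```

The idea is that the split only produces index lists, so the same Dataset object can back both loaders. Whether this is more or less elegant than using scikit-learn for the split is exactly what I am unsure about.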