Siamese dataset builder for large datasets


Disclaimer: I’m new to PyTorch.

I’ve been working on a parallel data loader that feeds a siamese network, set up so that only the batchSize images requested by the workers are loaded into memory at any one time.

This is what I have so far:

I want to go from

dst = SiameseDataset(pos_pairs_csv_path)

to

dst = SiameseDataset(pos_pairs_csv_path, neg_pairs_csv_path)

where, if my batchSize = N, then N/2 of the pairs in each batch come from the pos_pairs file and the other N/2 come from the neg_pairs file.
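One possible way to get that N/2 + N/2 split is a custom batch sampler that hands out index lists. This is only a minimal sketch, not a tested solution for this exact setup. The class name BalancedPairBatchSampler and its parameters are made up for illustration, and it assumes the combined dataset puts all positive pairs at indices 0..num_pos-1 and all negative pairs right after them.

```python
import random


class BalancedPairBatchSampler:
    """Yield index batches that are half positive pairs, half negative pairs.

    Hypothetical helper. Assumes indices [0, num_pos) are positive pairs and
    [num_pos, num_pos + num_neg) are negative pairs in the combined dataset.
    """

    def __init__(self, num_pos, num_neg, batch_size, shuffle=True, seed=None):
        if batch_size % 2 != 0:
            raise ValueError("batch_size must be even to split it in half")
        self.num_pos = num_pos
        self.num_neg = num_neg
        self.half = batch_size // 2
        self.shuffle = shuffle
        self.rng = random.Random(seed)

    def __len__(self):
        # Only full batches are produced; extra pairs from the larger file
        # are skipped in each pass.
        return min(self.num_pos, self.num_neg) // self.half

    def __iter__(self):
        pos = list(range(self.num_pos))
        neg = list(range(self.num_pos, self.num_pos + self.num_neg))
        if self.shuffle:
            self.rng.shuffle(pos)
            self.rng.shuffle(neg)
        for b in range(len(self)):
            lo, hi = b * self.half, (b + 1) * self.half
            # First half of the batch is positive, second half is negative.
            yield pos[lo:hi] + neg[lo:hi]
```

An object like this can be passed to the loader as `DataLoader(dst, batch_sampler=sampler, num_workers=...)`. When `batch_sampler` is given, `batch_size`, `shuffle`, `sampler` and `drop_last` should be left at their defaults. Since the sampler only produces indices, the images themselves are still loaded lazily in the dataset's `__getitem__`, one batch at a time.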

Thanks in advance for the help!

I wrote something that alternates between the pos and neg datasets.
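For anyone landing here later, the alternating idea could look roughly like the sketch below. It is not the exact code I used. The names AlternatingPairDataset, read_pairs and loader are hypothetical, and it assumes each CSV row holds two image paths with no header row.

```python
import csv


def read_pairs(csv_path):
    """Read (path_a, path_b) tuples; assumes two columns and no header."""
    with open(csv_path, newline="") as f:
        return [(row[0], row[1]) for row in csv.reader(f) if len(row) >= 2]


class AlternatingPairDataset:
    """Map-style dataset: even indices are positive pairs, odd are negative."""

    def __init__(self, pos_pairs, neg_pairs, loader):
        self.pos_pairs = pos_pairs
        self.neg_pairs = neg_pairs
        # loader is any callable that turns a path into an image or tensor;
        # it is only called in __getitem__, so nothing is loaded up front.
        self.loader = loader
        # Truncate to the shorter list so the two classes stay balanced.
        self.pairs_per_class = min(len(pos_pairs), len(neg_pairs))

    def __len__(self):
        return 2 * self.pairs_per_class

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        is_pos = index % 2 == 0
        source = self.pos_pairs if is_pos else self.neg_pairs
        path_a, path_b = source[index // 2]
        # Label 1 marks a positive pair, 0 a negative pair.
        return self.loader(path_a), self.loader(path_b), int(is_pos)
```

With an even batch size and shuffle turned off, consecutive indices give batches that are half positive and half negative. The pairs are interleaved inside the batch instead of grouped. Subclassing torch.utils.data.Dataset is the usual convention, but it is left out here to keep the sketch free of dependencies.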