I’m using a DataLoader to train a model. Unlike the usual setup, where the dataset loads one image at a time and the loader collects them into a batch, I want to load a sequence of images (and possibly a sequence of files) all together. This means that a batch contains num_of_batch*num_of_images_per_batch images. Naively, we could use a for loop inside the __getitem__ function, but Python’s for loops are very slow. Is there any way to speed this up?
Would it work if you set the batch_size to num_of_batch*num_of_images in your DataLoader and just reshape the batch after loading, i.e.:
for data, target in loader:
    data = data.view(num_of_batch, num_of_images, channels, w, h)
    target = ...
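A minimal runnable sketch of this reshape idea might look as follows. The `TensorDataset` with random images, the concrete sizes, and the integer labels are all placeholder assumptions for illustration, not part of the original setup. It also assumes `shuffle=False`, so that consecutive samples stay together and each group of `num_of_images` forms a contiguous sequence.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Assumed sizes, chosen only for illustration
num_of_batch, num_of_images = 4, 5
channels, w, h = 3, 8, 8

# Hypothetical stand-in data: 40 random "images" with labels 0..39
images = torch.randn(40, channels, w, h)
labels = torch.arange(40)
dataset = TensorDataset(images, labels)

# Load num_of_batch * num_of_images samples per step.
# shuffle=False keeps neighbouring samples in order;
# drop_last=True avoids a final partial batch that could not be reshaped.
loader = DataLoader(
    dataset,
    batch_size=num_of_batch * num_of_images,
    shuffle=False,
    drop_last=True,
)

for data, target in loader:
    # Split the flat batch dimension into (sequence, image-in-sequence)
    data = data.view(num_of_batch, num_of_images, channels, w, h)
    target = target.view(num_of_batch, num_of_images)
    print(data.shape, target.shape)
```

If the sequences need to be shuffled, shuffling would have to happen at the sequence level (for example with a custom sampler), since shuffling individual samples would mix images from different sequences.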
In case it’s not possible, I would try to load the files sequentially and see if it’s really the bottleneck of your training procedure.
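To check whether the sequential loading really is the bottleneck, one option is to write the naive version and time it. The sketch below is only an assumption about how that could look: `SequenceDataset` and `_load_image` are hypothetical names, and `_load_image` creates a dummy tensor where real code would read a file from disk.

```python
import time

import torch
from torch.utils.data import DataLoader, Dataset


class SequenceDataset(Dataset):
    """Hypothetical dataset where each item is a whole sequence of images."""

    def __init__(self, num_sequences, num_of_images, channels=3, w=8, h=8):
        self.num_sequences = num_sequences
        self.num_of_images = num_of_images
        self.shape = (channels, w, h)

    def __len__(self):
        return self.num_sequences

    def _load_image(self, seq_idx, img_idx):
        # Placeholder for real file loading: a tensor filled with the
        # global image index so the ordering is easy to check.
        value = float(seq_idx * self.num_of_images + img_idx)
        return torch.full(self.shape, value)

    def __getitem__(self, idx):
        # The "naive" Python loop over the images of one sequence
        frames = [self._load_image(idx, i) for i in range(self.num_of_images)]
        return torch.stack(frames)  # (num_of_images, channels, w, h)


dataset = SequenceDataset(num_sequences=8, num_of_images=5)
loader = DataLoader(dataset, batch_size=4, num_workers=0)

start = time.perf_counter()
batches = [batch for batch in loader]
elapsed = time.perf_counter() - start
print(f"loaded {len(batches)} batches in {elapsed:.4f}s")
```

Comparing this time against the duration of one training step shows whether the loop matters in practice. With `num_workers` greater than 0, the loop in `__getitem__` runs in worker processes in parallel with training, which often hides its cost.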