I am trying to retrieve sequences of bounding boxes to train an RNN. To do so, I defined a DataLoader like this:
I load a sequence of tensors from memory, where each tensor is a bounding box that has already been resized to fit VGG16. Reading from memory is a quick operation, so I thought a DataLoader built this way would be faster than loading an image, cropping it to the bounding box, and then resizing it on every call to __getitem__. The problem is that, for some reason, getting a batch takes approximately 10 seconds with num_workers=1 and about 2 minutes with num_workers=10.
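For reference, the kind of setup described above can be sketched roughly as follows. This is a minimal sketch under assumptions, not the original class: the name `BBoxSequenceDataset`, the dummy zero tensors, the sequence length of 5, and the crop shape (3, 224, 224) are all hypothetical stand-ins for the real data.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class BBoxSequenceDataset(Dataset):
    """Hypothetical dataset: each item is a sequence of pre-resized crops held in memory."""

    def __init__(self, sequences, labels):
        # sequences: list of lists of tensors, each assumed to have shape (3, 224, 224)
        # labels: one label per sequence
        self.sequences = sequences
        self.labels = labels

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        # Stack the crops of one sequence into a (seq_len, 3, 224, 224) tensor.
        return torch.stack(self.sequences[idx]), self.labels[idx]


# Dummy data standing in for the real crops: 8 sequences of 5 crops each.
sequences = [[torch.zeros(3, 224, 224) for _ in range(5)] for _ in range(8)]
labels = list(range(8))

dataset = BBoxSequenceDataset(sequences, labels)
# num_workers=0 loads in the main process; raise it to use worker processes.
loader = DataLoader(dataset, batch_size=4, shuffle=False, num_workers=0)
frames, targets = next(iter(loader))
print(frames.shape)  # torch.Size([4, 5, 3, 224, 224])
```

The default collate function stacks the per-item tensors along a new batch dimension, which only works when every sequence in the batch has the same length.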
So I tried building a batch manually:
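    import torch

    batch = []
    for i in range(64):
        det_frame_list = traindata[i]
        label = trainlabel[i]
        bbs = []
        for j in range(len(det_frame_list)):
            bbs.append(det_frame_list[j])  # assumed: the loop body is missing from the post
        tensor_ret = torch.stack(bbs)      # one (seq_len, ...) tensor per sequence
        batch.append(tensor_ret)           # assumed: collect each stacked sequence
    batch = torch.stack(batch)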
(don’t know how to use BBCode sorry!)
I did this to check whether my approach itself was slow, but it finishes instantly with the correct output.
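To make the comparison concrete, the time to fetch one batch through a DataLoader can be measured directly. The following is only a sketch under assumptions: `time_first_batch` is a hypothetical helper, and the `TensorDataset` of zeros stands in for the real bounding-box data.

```python
import time

import torch
from torch.utils.data import DataLoader, TensorDataset


def time_first_batch(dataset, batch_size, num_workers):
    """Return (seconds, batch) for fetching the first batch from a fresh DataLoader."""
    loader = DataLoader(dataset, batch_size=batch_size, num_workers=num_workers)
    start = time.perf_counter()
    batch = next(iter(loader))  # with num_workers > 0 this also starts the workers
    return time.perf_counter() - start, batch


# Dummy stand-in data: 16 sequences of 5 crops, each of shape (3, 224, 224).
data = torch.zeros(16, 5, 3, 224, 224)
targets = torch.arange(16)
dataset = TensorDataset(data, targets)

elapsed, (frames, labels) = time_first_batch(dataset, batch_size=8, num_workers=0)
print(f"num_workers=0: {elapsed:.3f}s, batch shape {tuple(frames.shape)}")
```

Calling the helper again with a larger `num_workers` gives a like-for-like number. Note that the first batch then includes the cost of starting the worker processes, and on platforms that spawn workers (Windows, macOS) the call should sit under an `if __name__ == "__main__":` guard.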
Since I am new to PyTorch, I can't figure out why the DataLoader takes so long to produce a batch. Any help understanding this would be greatly appreciated!