Hello. I found that the bottleneck of my training procedure so far is reading data from disk. For an image of size 640×480 I only need a 320×240 region, so I use random_crop. However, it would help if I could crop the same image multiple times and pack the crops into a batch in one go. Since __getitem__ should return a tensor of shape CHW, which the DataLoader then packs into NCHW, is there a way to pre-pack multiple crops inside __getitem__? Thanks!
You can set batch_size = 1 in the DataLoader and write your own dataset. It's something like:
```python
def __getitem__(self, index):
    path, target = self.imgs[index]
    img = self.loader(path)
    # ......
    crops = []
    for ii in range(self.num_crops):  # number of random crops per image
        crops.append(random_crop(img))
    # repeat the target so there is one entry per crop
    target = torch.full((self.num_crops,), target, dtype=torch.long)
    return torch.stack(crops), target
```
Thanks for your reply. However, I need the batch size to be larger than 1… And if I call torch.stack(imgs) in __getitem__, the returned data will have shape NCHW, while it should be CHW…
Then you also need to write your own collate function:
```python
def my_collate(batch):
    # each sample is (imgs, targets), imgs of shape (K, C, H, W),
    # targets a 1-D tensor with one entry per crop
    imgs, targets = zip(*batch)
    return torch.cat(imgs), torch.cat(targets)
```
and use it in the DataLoader:

```python
dataloader = DataLoader(dataset, collate_fn=my_collate)
```
Thanks! I got your point.