Hello. So far, the bottleneck in my project's training procedure is reading data from disk. Each image is 640×480, but I only need 320×240, so I use random_crop. It would help if I could crop the same image multiple times and pack the crops into a batch at once. Since the __getitem__ method should return a tensor of shape CHW, which the DataLoader then packs into NCHW, is there a way to pre-pack multiple crops in __getitem__? Thanks!
You can set batch_size=1 in the DataLoader and write your own dataset. You may refer to
It’s something like:
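def __getitem__(self, index):
    path, target = self.imgs[index]
    img = self.loader(path)
    # ......
    # batch_size here is the number of crops to take from this one image
    imgs = []
    for ii in range(batch_size):
        img_ = random_crop(img)
        imgs.append(img_)
    # stack the crops into one tensor of shape (batch_size, C, H, W)
    return torch.stack(imgs), target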
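For anyone who wants to try this idea end to end, here is a minimal runnable sketch. It is only an illustration under some assumptions: the class name MultiCropDataset, the parameter crops_per_image, and the random_crop helper below are made up for this example, and a random tensor stands in for the image that would normally be read from disk.

```python
import torch
from torch.utils.data import Dataset, DataLoader


def random_crop(img, out_h, out_w):
    """Take one random out_h x out_w crop from a CHW tensor (hypothetical helper)."""
    _, h, w = img.shape
    top = torch.randint(0, h - out_h + 1, (1,)).item()
    left = torch.randint(0, w - out_w + 1, (1,)).item()
    return img[:, top:top + out_h, left:left + out_w]


class MultiCropDataset(Dataset):
    """Loads each image once and returns several random crops of it."""

    def __init__(self, num_images=4, crops_per_image=8):
        self.num_images = num_images
        self.crops_per_image = crops_per_image

    def __len__(self):
        return self.num_images

    def __getitem__(self, index):
        # Stand-in for the expensive disk read: a random 3 x 480 x 640 image.
        img = torch.rand(3, 480, 640)
        target = index % 2  # dummy integer label
        crops = [random_crop(img, 240, 320) for _ in range(self.crops_per_image)]
        # Shape (crops_per_image, C, H, W): already a small batch.
        return torch.stack(crops), target


dataset = MultiCropDataset()
loader = DataLoader(dataset, batch_size=1)
crops, target = next(iter(loader))
# The DataLoader adds a leading dimension of size 1, so drop it again.
crops = crops.squeeze(0)
print(crops.shape)  # torch.Size([8, 3, 240, 320])
```

With batch_size=1 every training batch comes from a single image, so the disk is read once per batch instead of once per crop.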
Thanks for your reply. However, I need the batch size to be larger than 1. And if I call torch.stack(imgs) in __getitem__, the returned data will have shape NCHW when it should be CHW.
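To make the shape concern concrete, here is a small sketch of what the default collation does when each sample is already a stack of crops. The sample tensors and labels are invented for illustration, and default_collate is called directly instead of going through a DataLoader.

```python
import torch
from torch.utils.data.dataloader import default_collate

# Two samples, each already a stack of 4 crops with shape (4, 3, 240, 320).
samples = [(torch.rand(4, 3, 240, 320), 0), (torch.rand(4, 3, 240, 320), 1)]

imgs, targets = default_collate(samples)
print(imgs.shape)     # torch.Size([2, 4, 3, 240, 320]), one dimension too many
print(targets.shape)  # torch.Size([2])

# One possible workaround: merge the sample and crop dimensions back into NCHW.
flat_imgs = imgs.flatten(0, 1)
# Repeat each label once per crop so labels stay aligned with the crops.
flat_targets = targets.repeat_interleave(imgs.shape[1])
print(flat_imgs.shape)  # torch.Size([8, 3, 240, 320])
```

So the default behaviour produces a 5-D tensor, which has to be reshaped either in the training loop as above or inside a custom collate function.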
Then you also need to write your own collate_fn, something like:
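def my_collate(batch):
    # each item in batch is (stacked crops of shape (n, C, H, W), target)
    imgs, targets = zip(*batch)
    # torch.cat joins along dim 0, so the images stay 4-D (NCHW);
    # torch.cat(targets) requires each target to be a tensor with at least one dimension
    return torch.cat(imgs), torch.cat(targets)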
and use it in the DataLoader:
dataloader = DataLoader(dataset, collate_fn=my_collate)
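Here is a self-contained sketch of the collate_fn approach with a batch size larger than 1. It is not the exact code from the thread: StackedCropsDataset is a hypothetical stand-in that returns random tensors, and the labels are assumed to be plain Python ints (as in ImageFolder-style datasets), so the labels are repeated once per crop with torch.tensor instead of being passed to torch.cat.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class StackedCropsDataset(Dataset):
    """Hypothetical stand-in: each item is (crops of shape (n, C, H, W), int label)."""

    def __init__(self, num_images=6, crops_per_image=4):
        self.num_images = num_images
        self.crops_per_image = crops_per_image

    def __len__(self):
        return self.num_images

    def __getitem__(self, index):
        # Random data in place of real crops taken from one loaded image.
        crops = torch.rand(self.crops_per_image, 3, 240, 320)
        return crops, index % 2


def my_collate(batch):
    imgs, targets = zip(*batch)
    # Concatenate along dim 0: B stacks of (n, C, H, W) become (B * n, C, H, W).
    imgs = torch.cat(imgs)
    # Integer labels cannot go through torch.cat, so build one label per crop.
    targets = torch.tensor([t for crops, t in batch for _ in range(crops.shape[0])])
    return imgs, targets


loader = DataLoader(StackedCropsDataset(), batch_size=3, collate_fn=my_collate)
imgs, targets = next(iter(loader))
print(imgs.shape)  # torch.Size([12, 3, 240, 320])
print(targets)     # tensor([0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0])
```

Here batch_size counts images rather than crops, so the effective batch size seen by the model is batch_size times the number of crops per image.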
Thanks! I got your point.