I have a dataset which has images and corresponding texts.
I want to cut the dataset into inputs that each hold a sequence of 4 consecutive images with their corresponding texts,
like data = [ [image1, image2, image3, image4], [text1, ..., text4] ]
So, if I have 6 images, I can make 3 inputs (images 1-4, 2-5, and 3-6).
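To make the windowing concrete, here is a minimal sketch of what I mean. It assumes overlapping windows with a stride of 1, which is the only reading that gives 3 inputs from 6 images; the helper name `make_windows` and the placeholder strings are made up for illustration.

```python
def make_windows(images, texts, window=4):
    """Return a list of (image_window, text_window) pairs, stride 1 (assumed)."""
    count = max(len(images) - window + 1, 0)
    return [(images[i:i + window], texts[i:i + window]) for i in range(count)]


# Placeholder names standing in for 6 real images and their texts.
images = [f"image{i}" for i in range(1, 7)]
texts = [f"text{i}" for i in range(1, 7)]

windows = make_windows(images, texts)
print(len(windows))  # 3
print(windows[0])    # first window: image1..image4 with text1..text4
```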
My dataset class looks like this:
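import h5py as h5
from torch.utils.data import Dataset


class dataset(Dataset):
    def __init__(self, path, start, end):
        self.path = path
        self.data = self._getsubsequence(start, end)

    def _getMegabatch(self, start, end):
        # An h5py File cannot be sliced directly; slice a dataset inside it.
        # "images" and "texts" are assumed dataset names, not the real keys.
        with h5.File(self.path, "r") as file:
            return file["images"][start:end], file["texts"][start:end]

    def _getsubsequence(self, start, end):
        """Return every window of 4 consecutive images with their texts."""
        images, texts = self._getMegabatch(start, end)
        # Sliding window with stride 1 (assumed): 6 images give 3 windows.
        return [(images[i:i + 4], texts[i:i + 4])
                for i in range(len(images) - 3)]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        return self.data[index]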
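One alternative to materializing every window up front is to build each window on demand in `__getitem__`, so only the mega-batch itself is kept in memory. This is only a rough sketch under assumptions: in-memory lists stand in for the HDF5 data, and the class name `WindowDataset` and the `window` parameter are hypothetical. It is written without a torch dependency so it runs on its own; in real use it would subclass `torch.utils.data.Dataset`.

```python
class WindowDataset:
    """Map-style dataset sketch that slices windows lazily (stride 1 assumed)."""

    def __init__(self, images, texts, window=4):
        assert len(images) == len(texts), "each image needs a matching text"
        self.images = images
        self.texts = texts
        self.window = window

    def __len__(self):
        # Number of full windows that fit in the mega-batch.
        return max(len(self.images) - self.window + 1, 0)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        stop = index + self.window
        return self.images[index:stop], self.texts[index:stop]


# Integers and strings stand in for 6 real images and their texts.
ds = WindowDataset(list(range(6)), [f"text{i}" for i in range(6)])
print(len(ds))
print(ds[2])
```

The trade-off is a small slicing cost per item in exchange for not storing overlapping copies of the same images.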
I’m wondering whether this is a common way of generating mini-batches from mega-batch data.
Could you give me some advice?