Hi,
I’m trying to train a model with custom datasets.
I need to choose which folder each sample is drawn from according to a given probability.
For example, images should be sampled from folder A with 70% probability and from folder B with 30% probability.
This is the code I wrote, but I don’t think it’s effective:
import os
from glob import glob

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset


class CustomDataset(Dataset):
    def __init__(self, img_dir, img_dir2, ratio):
        self.ratio = ratio  # probability of sampling from img_dir (folder A)
        self.img_list = sorted(glob(os.path.join(img_dir, '**/Image_*'), recursive=True))
        self.img_list2 = sorted(glob(os.path.join(img_dir2, '**/Image_*'), recursive=True))

    def __getitem__(self, index):
        # The incoming index is ignored: a folder is picked at random, then an image within it.
        if np.random.random() < self.ratio:
            index = np.random.randint(len(self.img_list))
            img = Image.open(self.img_list[index])
        else:
            index = np.random.randint(len(self.img_list2))
            img = Image.open(self.img_list2[index])
        img = self.preprocess(img)  # preprocess (not shown) is expected to return a NumPy array
        return torch.from_numpy(img).type(torch.FloatTensor)

    def __len__(self):
        return len(self.img_list) + len(self.img_list2)
Is there a more effective way to perform this task?
I guess I need to use a sampler, but I can’t figure out how to use one.
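From the docs, torch.utils.data.WeightedRandomSampler looks like the closest fit. Below is a rough sketch of how I imagine it would work, not something I have verified in training. It assumes the dataset's __getitem__ is changed to index into the concatenation of the two lists (folder A first, then folder B) instead of drawing random indices itself. The names folder_sample_weights and make_loader are made up for illustration.

```python
def folder_sample_weights(n_a, n_b, ratio):
    """Per-sample weights so that folder A is drawn with probability `ratio`."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both folders must contain at least one image")
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")
    # Each folder's total weight equals its target probability,
    # split evenly among that folder's images.
    return [ratio / n_a] * n_a + [(1.0 - ratio) / n_b] * n_b


def make_loader(dataset, n_a, n_b, ratio, batch_size=32):
    """Wrap `dataset` (folder A items first, then folder B) in a weighted DataLoader.

    Hypothetical helper: assumes dataset[i] returns the i-th item of
    img_list + img_list2.
    """
    # torch is imported here so the weight helper above works without it.
    import torch
    from torch.utils.data import DataLoader, WeightedRandomSampler

    weights = torch.tensor(folder_sample_weights(n_a, n_b, ratio), dtype=torch.double)
    # replacement=True lets images from the smaller folder repeat within an epoch.
    sampler = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)
    # sampler and shuffle=True cannot be combined, so shuffle keeps its default.
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)
```

If this is right, the 70/30 split would hold on average across draws rather than exactly within every batch, and I would call it as make_loader(dataset, len(dataset.img_list), len(dataset.img_list2), 0.7).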
Thanks in advance