Fix random seed per validation sample but unfix it in training

manurare · October 28, 2022, 10:44am

Hi, I have a custom dataset class with the following structure. Both the validation and training datasets are created using this class.

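import random

from PIL import Image
import torch.utils.data as data_utils  # assumed meaning of the data_utils alias

class CustomDataset(data_utils.Dataset):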
    def __init__(self, seed_per_sample=False):
        self.image_files = []
        self.mask_files = []
        self.seed_per_sample = bool(seed_per_sample)
        ...

    def __getitem__(self, index):
        img = Image.open(self.image_files[index])
        if self.seed_per_sample:
            random.seed(index)
        else:
            random.seed()
        mask_files = random.sample(self.mask_files, 3)
        ...

train_dataset = CustomDataset(seed_per_sample=False)
val_dataset = CustomDataset(seed_per_sample=True)

I want the masks sampled for each validation item to be the same every time, but I want random masks during training. This is the only solution I could come up with. However, it makes reproducibility impossible, because every call to __getitem__ re-seeds the global random module. How can I avoid this, so that sampling is randomized in training but stays fixed in validation?
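
One direction that might work is to stop touching the global random state and draw from separate random.Random instances instead: a throwaway generator seeded with the index for validation, and one long-lived generator for training. The sketch below is only an illustration under assumptions. MaskSampler, train_seed, k, and the mask file names are hypothetical and not part of the class above, and the image loading is left out so the snippet runs with the standard library alone.

```python
import random


class MaskSampler:
    """Hypothetical helper that isolates mask sampling from the global RNG."""

    def __init__(self, mask_files, seed_per_sample=False, train_seed=None, k=3):
        self.mask_files = list(mask_files)
        self.seed_per_sample = bool(seed_per_sample)
        self.k = k
        # Private generator for training draws. train_seed=None seeds it from
        # OS entropy; passing an integer makes the training sequence repeatable.
        self._train_rng = random.Random(train_seed)

    def sample(self, index):
        if self.seed_per_sample:
            # Fresh local generator seeded by the index, so a given index
            # always yields the same masks.
            rng = random.Random(index)
        else:
            # Shared private generator, so successive calls keep advancing it.
            rng = self._train_rng
        return rng.sample(self.mask_files, self.k)


# Placeholder file names, for illustration only.
masks = [f"mask_{i}.png" for i in range(10)]
val_sampler = MaskSampler(masks, seed_per_sample=True)
train_sampler = MaskSampler(masks, seed_per_sample=False, train_seed=0)
```

Inside __getitem__, the call would then be something like self.sampler.sample(index) in place of random.seed(...) followed by random.sample(...). Because neither branch calls random.seed, a global seed set once at the start of the run is left alone. One caveat: with several DataLoader workers each worker holds its own copy of the dataset, so the training generator would probably need to be reseeded per worker, for example in a worker_init_fn.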