Hi! I have a question about the usage of SubsetRandomSampler.
I have multiple DataLoaders in my code, defined roughly like this:
import torch
from torch.utils.data import DataLoader
import torch.utils.data.sampler as torch_sampler

gen = torch.Generator()
gen.manual_seed(seed)

# Training loader: the order comes from SubsetRandomSampler, not shuffle
data_loader = DataLoader(train_set,
                         batch_size=int(args.batch_size),
                         shuffle=False,
                         sampler=torch_sampler.SubsetRandomSampler(range(100)),
                         generator=gen)

# Evaluation loader: sequential, no shuffling
eval_loader = DataLoader(eval_set,
                         batch_size=args.batch_size,
                         shuffle=False,
                         drop_last=False,
                         pin_memory=True)
The first loader is used during training, the latter during testing. With a fixed seed, if I skip testing in each epoch, the first dataloader produces the same batch order per epoch. But if I run testing in each epoch, with the same seed, the first dataloader produces a different batch order.
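For reference, here is a minimal, self-contained sketch of the behavior I see; the TensorDatasets, batch sizes, and the do-nothing eval loop are placeholders for my real setup:

import torch
from torch.utils.data import DataLoader, TensorDataset
import torch.utils.data.sampler as torch_sampler

seed = 0
torch.manual_seed(seed)

# Placeholder datasets standing in for my real train_set / eval_set
train_set = TensorDataset(torch.arange(100))
eval_set = TensorDataset(torch.arange(20))

gen = torch.Generator()
gen.manual_seed(seed)

data_loader = DataLoader(train_set, batch_size=10, shuffle=False,
                         sampler=torch_sampler.SubsetRandomSampler(range(100)),
                         generator=gen)
eval_loader = DataLoader(eval_set, batch_size=10, shuffle=False)

for epoch in range(3):
    # "Training": record the index order the sampler produces this epoch
    first_batch = next(iter(data_loader))[0].tolist()
    print(f"epoch {epoch}: first train batch = {first_batch}")

    # "Testing": just iterating the eval loader is enough; with this loop
    # commented out, the printed orders are reproducible across runs, but
    # with it in, the orders after epoch 0 differ from the no-testing case
    for _ in eval_loader:
        pass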
It is still the case even if I pass the seeded generator to data_loader, as in the snippet above. So what happens to the SubsetRandomSampler when I run testing right after training in every epoch?
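If it helps, this is the quick check I'm using to see whether merely iterating the eval loader touches the default global RNG (again with a placeholder dataset standing in for eval_set):

import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder eval loader; no shuffling or sampler randomness anywhere
eval_loader = DataLoader(TensorDataset(torch.arange(20)), batch_size=10)

before = torch.get_rng_state()
for _ in eval_loader:
    pass
after = torch.get_rng_state()

# False would mean a plain eval pass advanced the global RNG,
# which could explain why the training batch order shifts
print(torch.equal(before, after))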
@ptrblck sorry to bother you again, but could you please take a look at this?