In DataLoader, we can specify the worker_init_fn argument to change the seed accordingly. However, this function is only called once, when each worker is initialized. Is there any way to change the seed of a worker at runtime, for example every n minibatches?
Thanks in advance!
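For context, a minimal sketch of the worker_init_fn approach described above (the function name seed_worker and the base seed 1234 are made up for this illustration):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class RandDataset(Dataset):
    def __getitem__(self, idx):
        return torch.randn(1)

    def __len__(self):
        return 20

def seed_worker(worker_id):
    # Called once per worker process, at worker startup only,
    # which is exactly the limitation described above.
    torch.manual_seed(1234 + worker_id)

loader = DataLoader(RandDataset(),
                    batch_size=5,
                    num_workers=2,
                    worker_init_fn=seed_worker)
```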
Yes, you can set the seed via torch.manual_seed inside Dataset.__getitem__ and could also use the worker info, as seen here:
import torch
from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    def __init__(self):
        pass

    def __getitem__(self, idx):
        # Runs inside the worker process, so reseeding here takes effect at runtime.
        worker_info = torch.utils.data.get_worker_info()
        if worker_info is not None:
            print(worker_info)
            worker_id = worker_info.id
            torch.manual_seed(worker_id)
        return torch.randn(1)

    def __len__(self):
        return 20

dataset = MyDataset()
dataloader = DataLoader(dataset,
                        batch_size=5,
                        shuffle=False,
                        num_workers=8)

for data in dataloader:
    print(data)
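For the specific case of reseeding every n minibatches, one possible sketch (assuming shuffle=False, and that the dataset is told the batch_size and interval n, both made-up parameters for this illustration) derives the seed from the batch index inside __getitem__:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ReseedingDataset(Dataset):
    def __init__(self, length=20, batch_size=5, n=2):
        # batch_size and n are assumptions for this sketch: the dataset
        # needs them to map a sample index to a batch index.
        self.length = length
        self.batch_size = batch_size
        self.n = n

    def __getitem__(self, idx):
        # Batch index of this sample (only valid with shuffle=False).
        batch_idx = idx // self.batch_size
        worker_info = torch.utils.data.get_worker_info()
        worker_id = worker_info.id if worker_info is not None else 0
        # New seed every n minibatches, offset per worker; the factor
        # 1000 is an arbitrary spacing to keep worker seeds disjoint.
        seed = (batch_idx // self.n) * 1000 + worker_id
        torch.manual_seed(seed)
        return torch.randn(1)

    def __len__(self):
        return self.length

dataset = ReseedingDataset()
loader = DataLoader(dataset, batch_size=5, shuffle=False, num_workers=0)
for data in loader:
    print(data)
```

Note the caveat: because every __getitem__ call in the same n-batch window reseeds with the same value, torch.randn returns identical samples within that window; mix idx into the seed if each sample should still differ.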
Huh, it never crossed my mind to hijack __getitem__. Thanks @ptrblck, the Twitter-famous celebrity