How to set the same random seed for all workers?

Without setting a random seed, the data loader returns the same random data in every epoch:
epoch 1: worker1->[2], worker2->[2], epoch 2: worker1->[2], worker2->[2],...

When I set a random seed in the worker_init_fn function, I get different random data for each worker:
epoch 1: worker1->[2], worker2->[4], epoch 2: worker1->[7], worker2->[6],...

How do I set a random seed so that I get new random data each epoch, but the same random data for each worker?
epoch 1: worker1->[2], worker2->[2], epoch 2: worker1->[7], worker2->[7],...

def worker_init_fn(worker_id):                         
    print(torch.utils.data.get_worker_info().seed)
    print(torch.initial_seed())
    # seed numpy from the per-worker torch seed (base_seed + worker_id)
    np.random.seed(int(torch.utils.data.get_worker_info().seed)%(2**32-1))

train_loader = DataLoader(DatasetRandom(), batch_size=2, num_workers=4, worker_init_fn=worker_init_fn)

I use Ignite for training.

@odats can you set a seed each epoch for that?

@trainer.on(Events.EPOCH_STARTED)
def set_epoch_seed():
    set_seed(trainer.state.epoch)

If this does not work for you, please provide a minimal code snippet to see the problem.

set_seed is not defined; where can I find this method?

You can use ignite.utils.manual_seed, but what I meant was: set the seed of your random generator.

@trainer.on(Events.EPOCH_STARTED)
def set_epoch_seed():
    ignite.utils.manual_seed(trainer.state.epoch)

Yes, it works. But it has 2 issues:

  1. the validation data loader returns the same random values as the training loader
  2. it always returns the same values (because the seed values are always the same: 1, 2, 3, …)

Please provide a minimal code snippet to run and see the problem in detail.

class DatasetRandom(Dataset):   
    def __len__(self):
        return 8
    def __getitem__(self, idx):
        #print(torch.utils.data.get_worker_info().seed)
        return (np.random.randint(1000, size=1), idx%2)

def worker_init_fn(worker_id):                         
    #print(torch.utils.data.get_worker_info().seed)
    #print(torch.initial_seed())
    np.random.seed(int(torch.utils.data.get_worker_info().seed)%(2**32-1))

@trainer.on(Events.EPOCH_STARTED)
def set_epoch_seed():
    #print('seed', torch.initial_seed())
    # manual_seed(trainer.state.epoch)
    manual_seed(int(torch.initial_seed())%(2**32-1))

train_loader = DataLoader(DatasetRandom(), batch_size=1, num_workers=2)
val_loader = DataLoader(DatasetRandom(), batch_size=1, num_workers=2, worker_init_fn=worker_init_fn)

The two fixes were:

  1. add worker_init_fn=worker_init_fn to val_loader
  2. seed each epoch with manual_seed(int(torch.initial_seed()) % (2**32-1)) instead of manual_seed(trainer.state.epoch)

I have found the issue and updated my code. I have posted the working solution. Thank you for the support.

@odats great!

However, seeing your DatasetRandom implementation, it seems a bit weird to have the same random data each epoch, as you said at the very beginning:

Without setting a random seed the data loader returns the same random data for each epoch:
epoch 1: worker1->[2], worker2->[2], epoch 2: worker1->[2], worker2->[2],...

Probably, somewhere you have some unwanted random seed synchronization. Which Ignite version are you using, btw?

It is about PyTorch. I thought Ignite had some elegant solution for this behavior. As you suggested, I added a callback with @trainer.on(Events.EPOCH_STARTED).

class DatasetRandom(Dataset):    
    def __len__(self):
        return 2
    def __getitem__(self, idx):
        return (np.random.randint(1000, size=1), idx%2)

train_loader = DataLoader(DatasetRandom(), batch_size=1, num_workers=2)

for e in range(2): 
    print('epoch',e)
    for i, (images, labels) in enumerate(train_loader):
        print(i, images, labels)
epoch 0
0 tensor([[492]]) tensor([0])
1 tensor([[492]]) tensor([1])
epoch 1
0 tensor([[492]]) tensor([0])
1 tensor([[492]]) tensor([1])

Ignite: 0.4.1, PyTorch: 1.4.0

I thought Ignite had some elegant solution for this behavior.

In v0.3.0 we had something similar to set_epoch_seed done automatically in the Engine, but we found that it had a lot of side effects, like the ones you see with the evaluator, etc.

Btw, looking at the implementation of the DataLoader’s worker loop: https://github.com/pytorch/pytorch/blob/23db54acdf455866206d2de36c33fb75e177cb4c/torch/utils/data/_utils/worker.py

A seed is set up for torch and Python's random (but not numpy's random) each time a DataLoader iterator is created, in order to randomize the data; so if you replace your np.random.randint(1000, size=1) with random.randint(0, 1000), the data will be random for each epoch.
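
A minimal sketch of that idea, reusing the toy dataset above (the class name DatasetPyRandom is just for illustration):

import random
from torch.utils.data import Dataset, DataLoader

class DatasetPyRandom(Dataset):
    # Same toy dataset, but drawing from Python's random module, which the
    # DataLoader worker loop re-seeds every time an iterator is created.
    def __len__(self):
        return 2
    def __getitem__(self, idx):
        # random (unlike np.random) gets a fresh per-worker seed for each new
        # iterator, so these values change from epoch to epoch without extra code
        return (random.randint(0, 1000), idx % 2)

train_loader = DataLoader(DatasetPyRandom(), batch_size=1, num_workers=2)

for e in range(2):
    print('epoch', e)
    for i, (images, labels) in enumerate(train_loader):
        print(i, images, labels)

Note that each worker still gets its own seed (base_seed + worker_id in the worker loop), so the values also differ between workers.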

Thanks, but for numpy I should go with the original solution?

It depends on what is behind the np.random calls in your real case. I was thinking about data augmentations that can be parametrized with numpy, so if you have control over that, it would be simpler to replace the randomness generation…
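
For instance (a hypothetical sketch, the RandomBrightness transform below is not from the thread): an augmentation that currently draws its parameters with numpy can draw them with Python's random instead, so that the DataLoader's per-worker, per-iterator seeding applies to it:

import random

class RandomBrightness:
    # Hypothetical augmentation: scales an image tensor by a random factor.
    # Drawing the factor with Python's random (instead of np.random) means the
    # DataLoader's per-worker seeding makes it vary from epoch to epoch.
    def __init__(self, low=0.8, high=1.2):
        self.low = low
        self.high = high

    def __call__(self, img):
        factor = random.uniform(self.low, self.high)  # was: np.random.uniform(self.low, self.high)
        return img * factor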