Data scattering with DistributedDataParallel

I made a typo when passing the device: instead of torch.cuda.set_device(args.local_rank), I mistakenly called torch.cuda.set_device(range(2)).
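
For reference, here is a minimal sketch of the corrected setup (argument parsing follows the usual torch.distributed.launch convention; my actual script does more than this):

```python
import argparse

import torch
import torch.distributed as dist

# Standard torch.distributed.launch-style argument; torchrun instead
# exposes the rank through the LOCAL_RANK environment variable.
parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=0)
args = parser.parse_args()

# Each process pins itself to a single GPU by integer index --
# this is the call where I had passed range(2) instead of the rank.
torch.cuda.set_device(args.local_rank)
dist.init_process_group(backend="nccl")
```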

After fixing this typo, I still run into the same problem as described in How to scatter list data on multiple GPUs.
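
In case it helps, this is roughly the kind of scatter I am trying to get working: a minimal sketch using torch.distributed.scatter_object_list, with made-up placeholder data rather than my actual list. It assumes the process group is already initialized and the device is set as above.

```python
import torch.distributed as dist

def scatter_chunks(rank: int, world_size: int):
    # Placeholder payload: one picklable object per rank; my real data
    # is a list of variable-length items, not these dummy pairs.
    if rank == 0:
        chunks = [[2 * r, 2 * r + 1] for r in range(world_size)]
    else:
        chunks = None  # the input list is only needed on the src rank
    received = [None]  # scatter_object_list writes this rank's chunk here
    dist.scatter_object_list(received, chunks, src=0)
    return received[0]
```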

Thanks for any input.