From what I tested, DDP still runs on a single GPU, and the DistributedSampler essentially splits the dataset across a single replica, meaning we get the same dataset back unchanged. That being said, are there any benefits to using DDP on a single GPU over the normal (non-DDP) approach?
No, I don't think you should expect any benefit from launching DDP on a single GPU only. Apart from the source code changes needed to set it up, I would also expect it to behave no differently from a standalone single-GPU script.
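To illustrate why the sampler leaves the dataset unchanged with one replica, here is a minimal pure-Python sketch of the strided sharding that DistributedSampler performs, with shuffling left out. The helper name `shard_indices` is hypothetical and is not part of PyTorch; it only mimics the index arithmetic and assumes a non-empty dataset.

```python
import math


def shard_indices(dataset_len, num_replicas, rank):
    """Hypothetical helper: indices one rank would see, without shuffling."""
    # Each rank gets the same number of samples, rounded up.
    num_samples = math.ceil(dataset_len / num_replicas)
    total_size = num_samples * num_replicas
    # Pad by wrapping around to the start so the list divides evenly.
    indices = [i % dataset_len for i in range(total_size)]
    # Strided slice: rank r takes every num_replicas-th index starting at r.
    return indices[rank:total_size:num_replicas]


# One replica: the "shard" is the whole dataset, in order.
single_gpu = shard_indices(10, num_replicas=1, rank=0)

# Two replicas: each rank sees a disjoint half.
two_gpus = [shard_indices(10, num_replicas=2, rank=r) for r in range(2)]
```

With `num_replicas=1` the stride is 1 and no padding is needed, so the result is simply `range(dataset_len)`, which matches what was observed in the question.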