DDP on 8 GPUs vs. single-GPU training speed

Maybe you can find a solution together: