DataParallel with manual scatter

With DataParallel, how can we assign examples to GPUs manually while iterating over the data loader?

My dataset contains images of highly variable sizes, so we chose to use a batch size of 1. The automatic scatter in DataParallel splits along the batch dimension, and with a batch size of 1 the whole batch ends up on a single GPU.
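To illustrate the situation, here is a minimal sketch (with hypothetical `MyModel`, `dataset`, and `criterion` placeholders) of what happens with `nn.DataParallel` and a batch size of 1:

```python
import torch
import torch.nn as nn

# Placeholders: MyModel, dataset, and criterion stand in for the real ones.
model = nn.DataParallel(MyModel().cuda())
loader = torch.utils.data.DataLoader(dataset, batch_size=1)  # variable-size images

for image, target in loader:
    # image has shape (1, C, H, W); DataParallel scatters along dim 0,
    # so this single example goes to one GPU and the other GPUs stay idle.
    output = model(image.cuda())
    loss = criterion(output, target.cuda())
    loss.backward()
```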

Is there a way to compute the backward pass in a multi-GPU fashion in this context?

Do you want to try the DistributedDataParallel API, where you can spawn one process per GPU?
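A minimal sketch of that approach, assuming hypothetical `MyModel` and `MyDataset` placeholders: each process owns one GPU, `DistributedSampler` gives each process a different shard of the (batch size 1) examples, and gradients are averaged across processes during `backward()`.

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler

def worker(rank, world_size):
    # One process per GPU; rank identifies both the process and its device.
    os.environ.setdefault("MASTER_ADDR", "localhost")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    model = DDP(MyModel().cuda(rank), device_ids=[rank])  # MyModel is a placeholder
    dataset = MyDataset()                                 # MyDataset is a placeholder

    # Each process iterates over a disjoint subset of the dataset.
    sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank)
    loader = DataLoader(dataset, batch_size=1, sampler=sampler)

    for image, target in loader:
        output = model(image.cuda(rank))
        loss = torch.nn.functional.cross_entropy(output, target.cuda(rank))
        loss.backward()  # DDP averages gradients across all processes here
        # optimizer.step() / optimizer.zero_grad() would follow

if __name__ == "__main__":
    world_size = torch.cuda.device_count()
    mp.spawn(worker, args=(world_size,), nprocs=world_size)
```

This way every GPU processes its own single-image batch in parallel, instead of DataParallel trying to split one example across devices.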