Uneven workloads for multi-GPU training

Hi, can PyTorch split a batch unevenly across devices during multi-GPU training? Say GPU 0 is much faster than GPU 1; then it would be better to split the samples 7:3 rather than 1:1.
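
For context, this is roughly what I am trying to achieve by hand, as a minimal sketch with a toy model; the 7:3 split, batch size, and device ids are just placeholders:

```python
import torch
import torch.nn as nn

# Toy model replicated on two GPUs (placeholder architecture).
model0 = nn.Linear(128, 10).to("cuda:0")
model1 = nn.Linear(128, 10).to("cuda:1")
model1.load_state_dict(model0.state_dict())

batch = torch.randn(100, 128)

# Desired behavior: give the faster GPU 0 the larger chunk (70/30)
# instead of the usual even 50/50 split.
chunk0, chunk1 = torch.split(batch, [70, 30], dim=0)

out0 = model0(chunk0.to("cuda:0"))
out1 = model1(chunk1.to("cuda:1"))

# Gather the outputs back on one device for the loss computation.
out = torch.cat([out0, out1.to("cuda:0")], dim=0)
```

Is there a built-in way to get this kind of ratio-based split, or does it have to be done manually like above?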

Thanks.