Different batch sizes on each GPU using distributed training

Hi, Distributed Data-Parallel training is a very useful tool. I am training a network that needs different batch sizes on different GPUs. I'd like to know whether PyTorch distributed training supports using a different batch size on each GPU. Thank you!
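For context, here is a minimal sketch of what I have in mind, assuming a standard `torchrun` launch; the per-rank batch sizes, the tiny linear model, and the random dataset are just placeholders for illustration:

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

def main():
    # Standard DDP setup, launched e.g. with: torchrun --nproc_per_node=2 train.py
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank)

    # Hypothetical per-rank batch sizes -- this is what I want to do
    # (e.g. GPU 0 has more memory than GPU 1)
    batch_sizes = {0: 64, 1: 32}
    batch_size = batch_sizes[rank]

    # Dummy data just for illustration
    dataset = TensorDataset(torch.randn(1024, 10), torch.randint(0, 2, (1024,)))
    sampler = DistributedSampler(dataset)  # still splits the data evenly per rank
    loader = DataLoader(dataset, batch_size=batch_size, sampler=sampler)

    model = DDP(torch.nn.Linear(10, 2).cuda(rank), device_ids=[rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = torch.nn.CrossEntropyLoss()

    for x, y in loader:
        x, y = x.cuda(rank), y.cuda(rank)
        loss = loss_fn(model(x), y)
        optimizer.zero_grad()
        # I'm not sure whether DDP's gradient averaging handles unequal
        # batch sizes (and the resulting unequal iteration counts) correctly --
        # that is what I'm asking about.
        loss.backward()
        optimizer.step()

if __name__ == "__main__":
    main()
```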