Hey @shaoming20798, are G_AB and G_BA the two large models you referred to? Would it work to put G_AB and G_BA on two different GPUs, move the computed losses onto the same GPU, then compute loss_G on that GPU and run backward from there?
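Something like this sketch, assuming G_AB and G_BA are stand-ins for your generators (here tiny `nn.Linear` layers for illustration) and that you have two GPUs; it falls back to CPU so it can run anywhere. Note that `Tensor.to(device)` is tracked by autograd, so gradients flow back across the device copy:

```python
import torch
import torch.nn as nn

# Assumption: two GPUs are available; otherwise everything lands on CPU
# so the sketch still runs for illustration.
two_gpus = torch.cuda.device_count() >= 2
dev_a = torch.device("cuda:0" if two_gpus else "cpu")
dev_b = torch.device("cuda:1" if two_gpus else "cpu")

# Hypothetical stand-ins for your two large models.
G_AB = nn.Linear(8, 8).to(dev_a)  # first model on its own device
G_BA = nn.Linear(8, 8).to(dev_b)  # second model on the other device

x_a = torch.randn(4, 8, device=dev_a)
x_b = torch.randn(4, 8, device=dev_b)

loss_ab = G_AB(x_a).pow(2).mean()            # lives on dev_a
loss_ba = G_BA(x_b).pow(2).mean().to(dev_a)  # moved to dev_a; copy is differentiable

loss_G = loss_ab + loss_ba  # combined loss on one device
loss_G.backward()           # gradients reach parameters on both devices
```

After `backward()`, both models should have populated `.grad` fields on their respective devices, so each optimizer step can stay local to its GPU.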
BTW, for distributed training discussions, please consider adding a “distributed” tag. People working on distributed training actively check that category.