Let's say I have 2 models, A() and B(), and 2 GPUs, where the outputs of A are fed to B as inputs.
The two models are too big to fit on the same GPU, so I have to manually instantiate A on GPU 0 and B on GPU 1. Hence, I also have to manually move A's output to GPU 1 before feeding it to B.
Sometimes my batch size is too large to run A() on GPU 0 alone, but if I could also utilize GPU 1, I could in theory keep that batch size without reducing it.
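To make it concrete, here is a minimal sketch of what I am doing now (the tiny Linear layers and shapes are just placeholders for my real models):

import torch
import torch.nn as nn

# Placeholder models; my real A and B are much larger
model_a = nn.Linear(128, 64).to('cuda:0')   # A lives on GPU 0
model_b = nn.Linear(64, 10).to('cuda:1')    # B lives on GPU 1

x = torch.randn(32, 128, device='cuda:0')   # the whole batch starts on GPU 0
feat = model_a(x)                           # A's output is on GPU 0
feat = feat.to('cuda:1')                    # manually move it over to GPU 1
out = model_b(feat)                         # B consumes it on GPU 1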
The question is: can my models be placed on different GPUs but still run in DataParallel mode?
Update:
I saw a post mentioning this:
model1 = nn.DataParallel(model1).cuda(device=0)   # wrap model1 in DataParallel, parameters on GPU 0
model1_feat = model1(input_image)                 # forward pass through model1
model2 = nn.DataParallel(model2).cuda(device=1)   # wrap model2 in DataParallel, parameters on GPU 1
model2_feat = model2(model1_feat, input_feat)     # feed model1's features into model2
My question is: does that mean model1 is replicated on both GPUs?
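To clarify what I'm asking: my understanding is that nn.DataParallel also accepts an explicit device_ids argument, and that when it is omitted all visible GPUs are used. So I imagine the difference looks roughly like the sketch below (toy module, not my real models), and I'm unsure which behaviour the snippet above actually gives for model1:

import torch
import torch.nn as nn

m = nn.Linear(128, 64)

# No device_ids: DataParallel replicates the module on every visible GPU
# (here GPU 0 and GPU 1) and splits the batch between them on each forward.
dp_all = nn.DataParallel(m).cuda(device=0)

# Explicit device_ids: the replicas are restricted to the listed GPUs only.
dp_gpu0 = nn.DataParallel(m, device_ids=[0], output_device=0).cuda(device=0)

x = torch.randn(32, 128, device='cuda:0')
out_all = dp_all(x)    # batch is scattered across GPU 0 and GPU 1
out_0 = dp_gpu0(x)     # runs only on GPU 0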