Set up models on different GPUs and use DataParallel

Let's say I have two models, A() and B(), and two GPUs. The outputs of A are fed to B as inputs.

The two models are too big to fit on the same GPU, so I have to manually instantiate A on GPU 0 and B on GPU 1. Hence, I also have to manually move A's output to GPU 1 before feeding it to B.
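
Right now it looks roughly like the sketch below (the Linear layers are just placeholders standing in for my real A and B):

import torch
import torch.nn as nn

# Placeholder modules: A lives on GPU 0, B lives on GPU 1
A = nn.Linear(128, 64).to("cuda:0")
B = nn.Linear(64, 10).to("cuda:1")

x = torch.randn(32, 128, device="cuda:0")
a_out = A(x)                    # output lives on cuda:0
b_out = B(a_out.to("cuda:1"))   # manually move it to cuda:1 before feeding B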

Sometimes my batch size is too large to run A() on GPU 0 alone, but if I could also utilize GPU 1, I could keep that batch size without reducing it.

The question is: can my models be placed on different GPUs but still run in DataParallel mode?

Update:
I saw a post that mentioned this approach:

model1 = nn.DataParallel(model1).cuda(device=0)
model1_feat = model1(input_image)

model2 = nn.DataParallel(model2).cuda(device=1)
model2_feat = model2(model1_feat, input_feat)

My question is: does that mean model1 is replicated on both GPUs?

Yes, DataParallel will replicate the model, scatter the inputs, and gather the outputs in every iteration. So, for the above code snippet, every time you run model1_feat = model1(input_image), model1 is replicated to all available devices during the forward pass.
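
As an illustration, assuming two visible GPUs and placeholder Linear modules standing in for your models, the behavior looks roughly like this:

import torch
import torch.nn as nn

# Both wrappers span both GPUs; each forward call scatters the batch,
# replicates the wrapped module onto every device in device_ids, and
# gathers the result on output_device (device_ids[0], i.e. cuda:0, by default).
model1 = nn.DataParallel(nn.Linear(128, 64).cuda(0), device_ids=[0, 1])
model2 = nn.DataParallel(nn.Linear(64, 10).cuda(0), device_ids=[0, 1])

input_image = torch.randn(32, 128, device="cuda:0")
model1_feat = model1(input_image)  # replicated to GPUs 0 and 1, output gathered on cuda:0
model2_feat = model2(model1_feat)  # already on cuda:0, so no manual device change needed

If you instead want to keep each model pinned to its own GPU (as in your original setup), you can pass device_ids=[0] to one wrapper and device_ids=[1] to the other, but then each model runs on a single device and there is no data parallelism within it.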