Can I train two sequential models on different GPUs?

Suppose I have two models, A and B.

I want to chain A and B like an nn.Sequential(), but with A and B on different GPUs, because a single GPU does not have enough memory for both.


Y1 = A(X)         # A and X live on one GPU (e.g. GPU 0)
Y2 = Y1.cuda(1)   # copy A's output to GPU 1
Z = B(Y2)         # B lives on GPU 1

Will Z.backward() propagate gradients back across the GPUs, i.e. Z -> Y2 -> Y1 -> X?
And will an optimizer work with parameters spread over both GPUs?


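Yes, both autograd and optimizers work. The .cuda(1) copy is recorded by autograd like any other operation, so gradients flow back from B's GPU to A's GPU.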
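Here is a minimal sketch of one training step with this setup, in case it helps. The two nn.Linear layers are hypothetical stand-ins for A and B, and the shapes, the MSE loss, and the SGD optimizer are my assumptions, not something from the question. I reduce to a scalar loss because backward() with no arguments needs a scalar. The script falls back to CPU when two GPUs are not available, so it still runs anywhere.

```python
import torch
import torch.nn as nn

# Use two GPUs when available; otherwise fall back to CPU so the sketch still runs.
two_gpus = torch.cuda.is_available() and torch.cuda.device_count() >= 2
dev0 = torch.device("cuda:0" if two_gpus else "cpu")
dev1 = torch.device("cuda:1" if two_gpus else "cpu")

# Hypothetical stand-ins for models A and B (any nn.Module would do).
A = nn.Linear(8, 16).to(dev0)
B = nn.Linear(16, 4).to(dev1)

# A single optimizer can hold parameters that live on different devices.
optimizer = torch.optim.SGD(list(A.parameters()) + list(B.parameters()), lr=0.1)

# requires_grad=True on X only so we can inspect the gradient that reaches the input.
X = torch.randn(32, 8, device=dev0, requires_grad=True)
target = torch.randn(32, 4, device=dev1)  # assumed regression target, on B's device

Y1 = A(X)                 # computed on dev0
Y2 = Y1.to(dev1)          # differentiable copy, same role as Y1.cuda(1)
Z = B(Y2)                 # computed on dev1
loss = nn.functional.mse_loss(Z, target)

optimizer.zero_grad()
loss.backward()           # gradients flow Z -> Y2 -> Y1 -> X across devices
optimizer.step()          # updates A's and B's parameters in place
```

After backward(), A.weight.grad sits on A's device and B.weight.grad on B's device; each parameter is updated on the device where it lives.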
