Split a model and process it in multiple stages on a single GPU

Hi,

I have recently been working with one huge model.
I am curious whether we can split this model into several submodules and process them on a single GPU in multiple stages.
For example, we iteratively put one submodule on the GPU and run its forward pass, then move it back to the CPU and put the next submodule on the GPU.
Will backward work with this pipeline? I guess it works for the forward pass.

Thanks.
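
For the forward-only case, the staging scheme described above can be sketched roughly as follows. This is a minimal illustration, not a tested recipe for large models: the `stages` list and the `staged_forward` helper are hypothetical names, the layers are tiny stand-ins, and the sketch runs under `torch.no_grad()`, so it says nothing about whether backward works.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the submodules of a large model.
stages = [nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4)]

# Falls back to the CPU so the sketch also runs on a machine without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def staged_forward(stages, x, device):
    """Run the stages one at a time, keeping only the active one on `device`."""
    x = x.to(device)
    with torch.no_grad():  # inference only; no autograd graph is recorded
        for stage in stages:
            stage.to(device)   # move this stage's weights onto the device
            x = stage(x)
            stage.to("cpu")    # move the weights back to host memory
    return x.cpu()


out = staged_forward(stages, torch.randn(8, 16), device)
```

Only the activations passed between stages stay on the device for the whole loop; each stage's weights are resident only while that stage runs.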

Do you think checkpoint will work for you?
https://pytorch.org/docs/stable/checkpoint.html
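
As a rough illustration of what `torch.utils.checkpoint` does: it does not move weights on or off the GPU, but it avoids storing the intermediate activations of the wrapped function and recomputes them during backward, trading extra compute for lower memory use. The sketch below uses small hypothetical layers (`block`, `head`) and assumes a PyTorch version recent enough to accept the `use_reentrant` argument.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Hypothetical small layers standing in for a memory-hungry part of a model.
block = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 16))
head = nn.Linear(16, 1)

x = torch.randn(8, 16, requires_grad=True)

# Activations inside `block` are not kept after the forward pass;
# they are recomputed when backward reaches this segment.
h = checkpoint(block, x, use_reentrant=False)

loss = head(h).sum()
loss.backward()  # gradients flow through the checkpointed segment as usual
```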

Hi, thanks for the reply.
I think it would work, but I am not familiar with this feature and did not learn much from the documentation. Could you please provide an example or a tutorial?

@Priya_Goyal has created a nice tutorial here.

Thanks a lot for the tutorial.
To verify that I understand correctly:

The model consists of a transfer model (e.g., CycleGAN) and two downstream networks.
Now we divide the model into 4 submodules, say G_A, G_B, M_A, and M_B.
The custom function should be:

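def forward(input_A, input_B):
    # Each downstream network sees both the raw and the translated input.
    out_A1 = M_A(input_A)
    out_A2 = M_A(G_A(input_A))
    out_B1 = M_B(input_B)
    out_B2 = M_B(G_B(input_B))
    return out_A1, out_A2, out_B1, out_B2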

If we use checkpoint, should we define 4 run functions as follows?

# assumes: from torch.utils import checkpoint
def forward(input_A, input_B):
    out_A1 = checkpoint.checkpoint(run_function_MA1, input_A)
    out_A2 = checkpoint.checkpoint(run_function_MA2, input_A)
    out_B1 = checkpoint.checkpoint(run_function_MB1, input_B)
    out_B2 = checkpoint.checkpoint(run_function_MB2, input_B)
    return out_A1, out_A2, out_B1, out_B2

Thanks
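
One way the four-path layout above might look in runnable form is sketched below. It is an assumption-laden illustration, not a confirmed answer from the thread: the four networks are tiny hypothetical `nn.Linear` stand-ins, the run functions for the `*1` paths are taken to be the modules themselves, and `use_reentrant=False` assumes a recent PyTorch version.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Hypothetical tiny stand-ins for the two generators and two downstream networks.
G_A, G_B = nn.Linear(8, 8), nn.Linear(8, 8)
M_A, M_B = nn.Linear(8, 2), nn.Linear(8, 2)


def run_function_MA2(x):
    # Translated path for domain A: generator followed by downstream network.
    return M_A(G_A(x))


def run_function_MB2(x):
    # Translated path for domain B.
    return M_B(G_B(x))


def forward(input_A, input_B):
    # A module is itself a callable, so M_A and M_B can serve as run functions.
    out_A1 = checkpoint(M_A, input_A, use_reentrant=False)
    out_A2 = checkpoint(run_function_MA2, input_A, use_reentrant=False)
    out_B1 = checkpoint(M_B, input_B, use_reentrant=False)
    out_B2 = checkpoint(run_function_MB2, input_B, use_reentrant=False)
    return out_A1, out_A2, out_B1, out_B2


input_A, input_B = torch.randn(4, 8), torch.randn(4, 8)
outs = forward(input_A, input_B)

# Placeholder loss, used only to show that backward runs through all four paths.
loss = sum(o.sum() for o in outs)
loss.backward()
```

In this sketch M_A and M_B are each used in two checkpointed calls, and autograd accumulates their gradients from both paths.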