Can I store intermediate tensors to split the whole network training process into smaller stages?

I am not sure whether this is possible. I want to train a network in two stages: a former sub-network and a latter sub-network, where the latter receives a tensor from the former. First I train the former network to a stable state.
Instead of freezing the parameters with requires_grad=False or specifying the trainable parameters, I'd like to save the former's output tensors to disk (as .pth, .h5, or .npy files), so that I can train the latter without a long, redundant forward/backward pass through the former.
My situation is that the former network consists of many nn.Sequential blocks, and I did not put them into one class such as class FormerNet(nn.Module). I want to train the latter network, but it accepts a tensor from the former's output, so the data must flow through the former network first. Is there any idea on how to train the latter without too many changes?

As you said, you can save the outputs of the FormerNet to disk (if they are not too large), assuming the time required for I/O is less than the time needed to evaluate the FormerNet.
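
A minimal sketch of that workflow, assuming your trained blocks are in a list called former_blocks and your data comes from a loader called train_loader (both names are illustrative): run the former once under torch.no_grad(), cache its outputs with torch.save, then train the latter on the cached tensors alone.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# --- Stage 1: run the trained former network once and cache its outputs ---
# former_blocks: list of trained nn.Sequential modules (hypothetical name)
# train_loader: yields (inputs, labels) batches (hypothetical name)
features, labels = [], []
with torch.no_grad():  # no autograd graph is built, so this pass is cheap
    for x, y in train_loader:
        for block in former_blocks:
            x = block(x)
        features.append(x.cpu())
        labels.append(y)
torch.save({'features': torch.cat(features), 'labels': torch.cat(labels)},
           'former_out.pth')

# --- Stage 2: train the latter network on the cached tensors only ---
cache = torch.load('former_out.pth')
cached_loader = DataLoader(TensorDataset(cache['features'], cache['labels']),
                           batch_size=64, shuffle=True)
optimizer = torch.optim.Adam(latter_net.parameters())  # latter_net: your second sub-network
criterion = torch.nn.CrossEntropyLoss()
for x, y in cached_loader:
    optimizer.zero_grad()
    loss = criterion(latter_net(x), y)
    loss.backward()   # gradients stop at the cached tensor; the former is never touched
    optimizer.step()
```

Since the cached features are plain tensors, backpropagation naturally stops at them, so you get the effect of freezing the former network without touching requires_grad or rewriting your nn.Sequential blocks into a single class.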

I have done something similar, where I saved the outputs of a VGG network (tensors of size 2062) in .h5 format. This allowed me to save GPU memory and increase computation speed.
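
For the .h5 route, here is a sketch using h5py; the dataset name, the feature size of 2062, and the feature_batches iterable are illustrative. Writing with a resizable dataset lets you append batch by batch instead of holding everything in memory:

```python
import h5py
import numpy as np
import torch

# Write cached features incrementally (shapes are illustrative)
with h5py.File('vgg_features.h5', 'w') as f:
    dset = f.create_dataset('features', shape=(0, 2062), maxshape=(None, 2062),
                            dtype='float32', chunks=True)
    for batch in feature_batches:  # feature_batches: iterable of numpy arrays (hypothetical)
        n = dset.shape[0]
        dset.resize(n + batch.shape[0], axis=0)
        dset[n:] = batch

# Read the features back when training the latter network
with h5py.File('vgg_features.h5', 'r') as f:
    features = torch.from_numpy(f['features'][:])
```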