Training two models together: how to stop one and keep training the other?

Hi! I am trying to train two models jointly. The problem is a regression; the setup is as follows:

Networks F and G: F turns the input x into a feature vector F(x), and G takes F(x) and produces the final output G(F(x)).
This leads to a couple of questions:
1. Since G’s input is F’s output, once F converges I need to stop training F while continuing to train G.
The way I ‘lock’ F is to put F’s optimizer update (its optimizer.step() call) under an if statement, roughly as in the sketch after this list.
Does this do what I think it does (lock F, keep training G)?
2. Following the rule ‘total parameters = 2 * number of samples + dimension of input’, should ‘total parameters’ refer to the parameters of F plus the parameters of G, or to F and G separately? (i.e., if total parameters = 10, should F and G each have 10 parameters, or should their parameter counts add up to 10?)
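
Roughly, my training step looks like the sketch below (the network sizes, dummy data, and the f_converged flag are just placeholders for illustration):

```python
import torch
import torch.nn as nn

# Placeholder networks: F maps the input to a feature vector, G regresses on it.
F = nn.Sequential(nn.Linear(64, 32), nn.ReLU())
G = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 1))

optimizer_F = torch.optim.Adam(F.parameters(), lr=1e-3)
optimizer_G = torch.optim.Adam(G.parameters(), lr=1e-3)
criterion = nn.MSELoss()

x = torch.randn(8, 64)   # dummy batch
y = torch.randn(8, 1)
f_converged = False      # flipped to True once F has converged

features = F(x)
output = G(features)
loss = criterion(output, y)

optimizer_F.zero_grad()
optimizer_G.zero_grad()
loss.backward()          # backward must still run so G gets gradients

if not f_converged:
    optimizer_F.step()   # update F only while it has not converged
optimizer_G.step()       # G keeps training either way
```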

Any answers, critiques, and questions would be helpful!
Many Thanks!

Re 1: Yes, that should work, but you should also call .eval() on the frozen model to prevent, e.g., any normalization layers from updating their running stats, and wrap the frozen model’s forward pass in with torch.no_grad(): to save memory.
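
A minimal sketch of that suggestion (shapes and names below are placeholders; a BatchNorm layer is included just to show why eval() matters):

```python
import torch
import torch.nn as nn

# Placeholder networks; the BatchNorm1d layer is what makes F.eval() relevant.
F = nn.Sequential(nn.Linear(64, 32), nn.BatchNorm1d(32), nn.ReLU())
G = nn.Sequential(nn.Linear(32, 1))

optimizer_G = torch.optim.Adam(G.parameters(), lr=1e-3)
criterion = nn.MSELoss()

x = torch.randn(8, 64)   # dummy batch
y = torch.randn(8, 1)

F.eval()                  # stop BatchNorm/Dropout-style layers from updating stats
with torch.no_grad():     # no autograd graph is built for F, saving memory
    features = F(x)       # features.requires_grad is False

output = G(features)      # G's forward still builds a graph
loss = criterion(output, y)

optimizer_G.zero_grad()
loss.backward()           # gradients flow only through G
optimizer_G.step()
```

(Setting requires_grad_(False) on F’s parameters is another common way to make the freeze explicit, though with no_grad() around F’s forward it is not strictly necessary.)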

Re 2: I’m not sure there are any hard-and-fast scaling rules like that, especially without knowing what domain you are working in. In general, most deep learning models these days err on the side of overparameterization.

Hi eqy,

Thanks for your help!
1. I am adding these (torch.no_grad() and eval()) to my code!

2. My input is a very long vector and my output is a single number. Currently my F network has more than enough parameters, but my G network is ‘under-parameterized’. Increasing G’s parameter count doesn’t seem to improve the results.