Hi! I am trying to train two models on a regression problem. The setup is as follows:

Two networks, F and G: F turns the input x into a feature vector F(x), and G takes F(x) and outputs G(F(x)).

This leads to two questions:

1. Since G's input is F's output, once F converges I want to stop updating F while continuing to train G.

The way I 'lock' F is to put F's loss.backward() and optimizer.step() calls under an if statement.

Does this do what I think it does (freeze F while G keeps training)?
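Roughly what I have in mind, as a toy sketch (the Linear layers are just hypothetical stand-ins for F and G; here the freeze is done with requires_grad and by giving the optimizer only G's parameters, rather than only skipping the backward/step calls):

```python
import torch
import torch.nn as nn

# Toy stand-ins for the two networks.
F = nn.Linear(4, 8)   # feature extractor F
G = nn.Linear(8, 1)   # regressor G on top of F(x)

# Freeze F: stop gradient computation for its parameters...
for p in F.parameters():
    p.requires_grad = False
F.eval()  # ...and fix dropout/batch-norm behavior, if F had any

# ...and let the optimizer see only G's (still-trainable) parameters.
optimizer = torch.optim.SGD(G.parameters(), lr=0.1)

x = torch.randn(2, 4)
y = torch.randn(2, 1)

before = F.weight.clone()
loss = nn.functional.mse_loss(G(F(x)), y)
loss.backward()
optimizer.step()

# F's weights are untouched and received no gradient; G did.
assert torch.equal(F.weight, before)
assert F.weight.grad is None
assert G.weight.grad is not None
```

The point of doing it this way: if F's parameters are still registered in the optimizer and still have requires_grad=True, merely skipping backward/step for F is fragile (e.g. stateful optimizers like Adam can still move parameters with accumulated momentum).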

2. Following the rule 'total parameters = 2 × number of samples + dimension of input', should 'total parameters' mean the parameters of F and G combined, or each network separately? (I.e., if the rule gives 10, should F and G each have 10 parameters, or should their counts add up to 10?)
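I can't vouch for that rule, but since G consumes F's output, the pipeline G(F(x)) behaves as one composite model, so I'd expect a capacity heuristic to apply to the combined count. A quick way to count either view (again with hypothetical Linear stand-ins for F and G):

```python
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    """Total number of learnable scalar parameters in a module."""
    return sum(p.numel() for p in model.parameters())

F = nn.Linear(4, 8)   # 4*8 weights + 8 biases  = 40 parameters
G = nn.Linear(8, 1)   # 8*1 weights + 1 bias    =  9 parameters

assert count_params(F) == 40
assert count_params(G) == 9
# The composite pipeline G(F(x)) has the sum of both:
assert count_params(nn.Sequential(F, G)) == 49
```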

Any answers, critiques, or questions would be helpful!

Many Thanks!