I'm trying to implement a network that runs twice in one epoch with different sets of parameters. The workflow looks like the following (G being the network, and P1 and P2 being subsets of the parameters of G):

G(with P1 set of parameters) --> loss calculation --> loss.backward() --> G(exchange P1 with P2) --> loss calculation --> loss.backward() --> optimizer.step()

P1 and P2 are mutually exclusive: each should be updated only by the gradients from its own pass, while the rest of the parameters should be updated by the combined gradients from both passes. How should I go about it?
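To make the question concrete, here is a minimal sketch of what I mean. The model `G`, the layer names `p1`/`p2`/`shared`, and the flag `use_p1` are all hypothetical; the point is that because each forward pass only touches one of the two subsets, calling `backward()` twice before `optimizer.step()` naturally accumulates gradients on `p1` from pass 1, on `p2` from pass 2, and on the shared parameters from both:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class G(nn.Module):
    """Toy network: the first layer can be swapped between p1 and p2."""
    def __init__(self):
        super().__init__()
        self.p1 = nn.Linear(4, 4)      # the "P1" subset
        self.p2 = nn.Linear(4, 4)      # the "P2" subset
        self.shared = nn.Linear(4, 1)  # parameters used in both passes

    def forward(self, x, use_p1=True):
        # Only one of p1/p2 enters the graph per pass, so each subset
        # receives gradients only from its own backward() call.
        h = self.p1(x) if use_p1 else self.p2(x)
        return self.shared(h)

g = G()
opt = torch.optim.SGD(g.parameters(), lr=0.1)
x = torch.randn(8, 4)
y = torch.randn(8, 1)

opt.zero_grad()
loss1 = F.mse_loss(g(x, use_p1=True), y)
loss1.backward()   # gradients land on p1 and shared only
loss2 = F.mse_loss(g(x, use_p1=False), y)
loss2.backward()   # gradients accumulate on p2 and shared
opt.step()         # one step updates all three parameter groups
```

Is this the right way to do it, or do I need to mask/zero gradients manually between the two backward calls?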