Joining multiple neural nets in non-sequential model

Hadi_Nayebi · May 23, 2021, 12:48am

Hey, I hope everyone is having a great day.
I have the following setup:

state = netA(input)
output1 = netB(state)
output2 = netC(state)

I can define target values for output1 and output2. How would you approach training netA?
I am trying to combine the gradients at the input of netB and netC (state) and use them as the loss to train netA.
Or should I use them separately?
How can I get the gradient at the input of netB and netC?

Best,
hn

ptrblck · May 23, 2021, 8:11am

You could calculate the losses from output1 and output2, accumulate them, and call loss.backward() to calculate the gradients for all 3 models. If you don’t want to train netB and netC, you could set the .requires_grad attribute of their parameters to False.
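
A minimal sketch of that approach could look like the following. The layer types, tensor shapes, MSE criterion, and SGD optimizer are assumptions made for illustration, since the original post does not specify them:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the three networks; sizes are arbitrary.
netA = nn.Linear(8, 4)
netB = nn.Linear(4, 2)
netC = nn.Linear(4, 3)

# Optional: freeze netB and netC so that only netA is trained.
for p in list(netB.parameters()) + list(netC.parameters()):
    p.requires_grad_(False)

optimizer = torch.optim.SGD(netA.parameters(), lr=0.1)
criterion = nn.MSELoss()  # assumed loss; use whatever fits your targets

x = torch.randn(16, 8)
target1 = torch.randn(16, 2)
target2 = torch.randn(16, 3)

state = netA(x)
# Accumulate both losses into a single scalar.
loss = criterion(netB(state), target1) + criterion(netC(state), target2)

optimizer.zero_grad()
loss.backward()   # gradients flow through netB and netC back into netA
optimizer.step()  # updates netA only
```

Even with netB and netC frozen, the gradient still passes through them to reach netA; freezing only prevents their own parameters from receiving gradients.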

In case you want to get the gradient of state, you could call state.retain_grad() before the backward call, which would then allow you to access its .grad afterwards.

Hadi_Nayebi · May 31, 2021, 2:28am

Thank you, ptrblck. I will try your suggestions and get back to you if I have more questions. I am a newbie, and things are slow on my side.