I have two networks, “net1” and “net2”.
Let us say “loss1” and “loss2” are the classifier losses of “net1” and “net2”, respectively.
Let's say “optimizer1” and “optimizer2” are the optimizers of the two networks.
“net2” is a pretrained network, and I want to backpropagate the gradients of the “net2” loss into “net1”.
loss1 = … (some loss defined)
So, loss1 = loss1 + loss2 (let's say that loss2 was defined earlier).
Then I call:
loss1.backward(retain_graph=True)  # what happens when I write this?
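For concreteness, here is a minimal sketch of the setup I have in mind. The layer sizes, the random data, the specific loss functions, and the assumption that “net2” takes the output of “net1” as its input are all placeholders made up for illustration, not my real code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-ins: the real architectures are not shown in this post.
net1 = nn.Linear(4, 4)
net2 = nn.Linear(4, 2)  # imagine this one has been pretrained

optimizer1 = torch.optim.SGD(net1.parameters(), lr=0.1)
optimizer2 = torch.optim.SGD(net2.parameters(), lr=0.1)

# Made-up data and targets.
x = torch.randn(8, 4)
target1 = torch.randn(8, 4)
target2 = torch.randint(0, 2, (8,))

out1 = net1(x)
out2 = net2(out1)  # assumption: net2 consumes net1's output, so loss2 depends on net1

loss1 = F.mse_loss(out1, target1)        # placeholder for "some loss defined"
loss2 = F.cross_entropy(out2, target2)   # placeholder classifier loss for net2
loss1 = loss1 + loss2

optimizer1.zero_grad()
optimizer2.zero_grad()
loss1.backward(retain_graph=True)

# After backward(), parameters of both networks have a .grad tensor attached.
net1_has_grads = all(p.grad is not None for p in net1.parameters())
net2_has_grads = all(p.grad is not None for p in net2.parameters())
```

In this sketch the gradients of loss2 reach “net1” only because out2 was computed from out1; if the two networks were fed independently, loss2 would contribute nothing to the gradients of “net1”.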
What is the difference between backward() and step()?
If I do not call loss1.backward(), what will happen?
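To make these two questions concrete, here is a small sketch I would use to check what each call does. It assumes a toy linear model and plain SGD, both made up for illustration. As far as I understand it, backward() only fills in the .grad fields of the parameters, and step() is what actually changes the weights using those gradients.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy model and optimizer, chosen only for illustration.
net = nn.Linear(3, 1)
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)

x = torch.randn(5, 3)
loss = net(x).pow(2).mean()

weights_before = net.weight.detach().clone()
grad_before_backward = net.weight.grad  # None: no gradient has been computed yet

# backward() computes gradients and stores them in .grad; weights stay the same.
loss.backward()
unchanged_after_backward = torch.equal(net.weight.detach(), weights_before)

# step() reads .grad and updates the weights in place.
optimizer.step()
changed_after_step = not torch.equal(net.weight.detach(), weights_before)

# Second case: step() without any backward() call.
net_b = nn.Linear(3, 1)
optimizer_b = torch.optim.SGD(net_b.parameters(), lr=0.1)
weights_b_before = net_b.weight.detach().clone()
optimizer_b.step()  # .grad is None for every parameter, so SGD skips them
skipped_without_backward = torch.equal(net_b.weight.detach(), weights_b_before)
```

If this sketch is right, leaving out loss1.backward() would mean the optimizer has no gradients to work with, so step() would leave the parameters as they were.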