Backprop from loss from a different network

What happens when you call step() on one network's optimizer, but the backward pass ran through a different network? The architectures of the two networks could be different. Here is some sample pseudocode:

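net_1 = Net_1()
net_2 = Net_2()
optimizer_1 = Optimizer(net_1.parameters())
optimizer_2 = Optimizer(net_2.parameters())
outputs = net_1(inputs)  # call the instance, not the class
loss = CrossEntropyLoss()(outputs, targets)
loss.backward()  # backprop goes through net_1 only
optimizer_2.step()  # but the step is taken with net_2's optimizer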

I backprop through net_1, but I am stepping with the optimizer for net_2.


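optimizer_2.step() only uses the gradients stored on net_2's parameters.
So if net_2 wasn't used in the forward pass and no backward ran through it, those grads will be empty (None, or 0 if they were zeroed earlier) and the step will do nothing to net_2, at least for a plain optimizer with no weight decay or leftover momentum. net_1's gradients do get filled in by loss.backward(), but they are only applied once you call optimizer_1.step().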
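
To check this yourself, here is a minimal sketch in PyTorch. The two small models, the random data, and the plain SGD optimizers are stand-ins I made up for illustration, not the networks from the question. It snapshots both sets of parameters, backprops through net_1, steps with optimizer_2, and then compares.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for Net_1 and Net_2, with different architectures.
net_1 = nn.Linear(4, 3)
net_2 = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))

# Plain SGD (no momentum, no weight decay) is an assumption of this sketch.
optimizer_1 = torch.optim.SGD(net_1.parameters(), lr=0.1)
optimizer_2 = torch.optim.SGD(net_2.parameters(), lr=0.1)

inputs = torch.randn(5, 4)
targets = torch.randint(0, 3, (5,))

# Snapshot the parameters so we can tell afterwards whether they moved.
before_1 = [p.detach().clone() for p in net_1.parameters()]
before_2 = [p.detach().clone() for p in net_2.parameters()]

# The forward and backward passes involve net_1 only.
loss = nn.CrossEntropyLoss()(net_1(inputs), targets)
loss.backward()

# Step with the optimizer that owns net_2's parameters.
optimizer_2.step()

net_1_has_grads = all(p.grad is not None for p in net_1.parameters())
net_2_has_grads = any(p.grad is not None for p in net_2.parameters())
net_1_unchanged = all(
    torch.equal(b, p.detach()) for b, p in zip(before_1, net_1.parameters())
)
net_2_unchanged = all(
    torch.equal(b, p.detach()) for b, p in zip(before_2, net_2.parameters())
)

# Only now are net_1's gradients actually applied.
optimizer_1.step()
net_1_changed = any(
    not torch.equal(b, p.detach()) for b, p in zip(before_1, net_1.parameters())
)

print(net_1_has_grads, net_2_has_grads)
print(net_1_unchanged, net_2_unchanged, net_1_changed)
```

After optimizer_2.step(), neither network has moved: net_1 holds gradients that nobody applied, and net_2 has no gradients to apply. Only the later optimizer_1.step() changes net_1.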