Copy a model's gradients to another model

Hi all,

I am trying to implement the communication of gradients between two models with the same architecture, i.e. copying model A’s gradients to model B.

In model A, I use:
gradients = {}
inputs =
labels =
outputs = modelA(inputs)
loss = criterion(outputs, labels)
for name, param in modelA.named_parameters():
    gradients[name] = param.grad.clone()

In model B, I use:
for name, param in modelB.named_parameters():
    param.grad = gradients[name]

I am not sure whether there is an error in this part, but the test accuracy on the CIFAR-10 dataset stays around 10%–15% after 50 epochs. I’d really appreciate some help.
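For reference, here is a minimal self-contained sketch of the gradient copy using small toy modules (the `nn.Linear` models, inputs, and labels are placeholders, not your actual setup). One detail worth double-checking: `param.grad` is only populated after `loss.backward()` has been called, so the snapshot must happen after the backward pass:

```python
import torch
import torch.nn as nn

# Two models with identical architecture (toy stand-ins).
modelA = nn.Linear(4, 2)
modelB = nn.Linear(4, 2)

criterion = nn.CrossEntropyLoss()
inputs = torch.randn(8, 4)
labels = torch.randint(0, 2, (8,))

# Forward and backward on model A; without loss.backward(),
# param.grad would still be None when we try to clone it.
outputs = modelA(inputs)
loss = criterion(outputs, labels)
modelA.zero_grad()
loss.backward()

# Snapshot A's gradients by name.
gradients = {name: param.grad.clone()
             for name, param in modelA.named_parameters()}

# Copy them onto B's parameters (clone so A and B don't share storage).
for name, param in modelB.named_parameters():
    param.grad = gradients[name].clone()

# B's gradients now match A's, and optimizerB.step() would apply them.
for name, param in modelB.named_parameters():
    assert torch.equal(param.grad, gradients[name])
```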

The code looks correct (at least I don’t see any obvious issues).

Could you explain why this approach should work at all?
Assuming the two models have different parameters, I would expect modelB’s training to fail if modelA’s gradients are applied to it.

Sorry, I forgot to mention that I synchronize the weights of the two models after each epoch of training by passing modelB’s state_dict to modelA, which then loads it. So if modelA can converge to a minimum of the loss function, modelB should converge to the same point. If my reasoning is correct, this setup should work. I am sure modelA can reach at least 70% test accuracy when I train it on its own.
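The per-epoch synchronization described above can be sketched like this (again with toy `nn.Linear` modules standing in for the real models):

```python
import torch
import torch.nn as nn

modelA = nn.Linear(4, 2)
modelB = nn.Linear(4, 2)

# End of epoch: push B's weights into A so both models start the
# next epoch from the same point in parameter space.
modelA.load_state_dict(modelB.state_dict())

# The two models now hold identical parameters.
for pA, pB in zip(modelA.parameters(), modelB.parameters()):
    assert torch.equal(pA, pB)
```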

By the way, I define modelA and modelB, and optimizerA and optimizerB, as global variables in two different Python files. Could that be the source of the problem?