NetA = Model()
NetB = Model()
optimizer = torch.optim.Adam(list(NetA.parameters()) + list(NetB.parameters()))
scheduler = torch.optim.lr_scheduler.SomeScheduler(optimizer)  # placeholder for whichever scheduler is used
for epoch in range(epochs):
    ...  # train both networks for one epoch
    # Want to copy parameters from NetB to NetA
Hi all. I have two networks with the same structure and I train them simultaneously. After every epoch I want to copy the parameters from NetB to NetA and then train the next epoch. I know NetB.state_dict() can create a copy, but I'm worried that loading it could stop the optimizer from working. What is the correct way to copy only the parameter values from NetB to NetA without affecting anything else?
I would probably use paramA.copy_(paramB) in a no_grad() context, but loading the state_dict also seems to work, since the parameters are still updated independently afterwards:
import torch
import torch.nn as nn

NetA = nn.Linear(1, 1, bias=False)
NetB = nn.Linear(1, 1, bias=False)
optimizer = torch.optim.Adam(list(NetA.parameters()) + list(NetB.parameters()), lr=1.)
for _ in range(5):
    out1 = NetA(torch.randn(1, 1))
    out2 = NetB(torch.randn(1, 1))
    loss = out1 + out2
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    # copy NetB's parameters into NetA
    NetA.load_state_dict(NetB.state_dict())
    for paramA, paramB in zip(NetA.parameters(), NetB.parameters()):
        print(torch.equal(paramA, paramB))  # True right after the copy
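For completeness, here is a minimal sketch of the manual copy_ approach mentioned above; it assumes both models yield their parameters in the same order (which holds here since they share the same structure):

import torch

# overwrite NetA's parameters with NetB's values in-place;
# no_grad() keeps the copy out of autograd, and the optimizer state is untouched
with torch.no_grad():
    for paramA, paramB in zip(NetA.parameters(), NetB.parameters()):
        paramA.copy_(paramB)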
However, after checking the code it seems the same approach (an in-place copy_ inside a no_grad() context) is used internally in load_state_dict.
Thanks for your suggestions. So these two methods, paramA.copy_(paramB) and NetA.load_state_dict(NetB.state_dict()), are the same?
Yes, based on my code snippet and the linked code, the same approach would be used.
However, it would be great if you could also verify it with your real model in case I’m missing something in my small example.
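If you want a cheap check with the real model, you could confirm that load_state_dict copies the values into the existing parameter tensors instead of replacing them, since that is exactly what keeps the optimizer's references valid. A rough sketch, assuming NetA, NetB, and optimizer are your actual objects:

import torch

params_before = list(NetA.parameters())
NetA.load_state_dict(NetB.state_dict())
params_after = list(NetA.parameters())

# same tensor objects as before, now holding NetB's values
print(all(pa is pb for pa, pb in zip(params_before, params_after)))                      # expected: True
print(all(torch.equal(pa, pb) for pa, pb in zip(NetA.parameters(), NetB.parameters())))  # expected: True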