Hi all, I'm confused about three ways of passing parameters to an optimizer.
1. optimizer = torch.optim.Adam(list(model1.parameters()) + list(model2.parameters()), ...)
vs
2. optimizer = torch.optim.SGD([{'params': model1.parameters()},
{'params': model2.parameters()}], ...)
vs
3. optimizer = torch.optim.Adam(whole_network.parameters(), ...)
(whole_network includes model1 and model2)
Do these three optimizers behave the same? Can anyone compare them?
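
For reference, here is a minimal sketch of one way to inspect the three forms side by side. It is not a definitive answer: `model1`, `model2`, and `whole_network` are hypothetical stand-ins (two small `nn.Linear` layers wrapped in an `nn.Sequential`), the learning rate is an arbitrary assumption, and Adam is used for all three forms (form 2 above uses SGD) so that the only difference is how the parameters are passed in. It only checks which tensors each optimizer holds and how many parameter groups it creates.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for model1 and model2 from the question.
model1 = nn.Linear(4, 3)
model2 = nn.Linear(3, 2)
# Hypothetical whole_network that contains both as submodules.
whole_network = nn.Sequential(model1, model2)

lr = 1e-3  # assumed value, only for illustration

# Form 1: one concatenated list of parameters.
opt1 = torch.optim.Adam(
    list(model1.parameters()) + list(model2.parameters()), lr=lr
)
# Form 2: a list of dicts, one dict per model.
opt2 = torch.optim.Adam(
    [{'params': model1.parameters()}, {'params': model2.parameters()}], lr=lr
)
# Form 3: parameters of the enclosing module.
opt3 = torch.optim.Adam(whole_network.parameters(), lr=lr)


def param_ids(opt):
    """Collect the identities of all tensors registered in an optimizer."""
    return {id(p) for group in opt.param_groups for p in group['params']}


# How many parameter groups each optimizer created.
num_groups = [len(opt.param_groups) for opt in (opt1, opt2, opt3)]
# Whether all three optimizers hold exactly the same tensors.
same_params = param_ids(opt1) == param_ids(opt2) == param_ids(opt3)

print(num_groups)   # [1, 2, 1]
print(same_params)  # True
```

In this sketch all three optimizers end up holding the same tensors. Form 2 differs only in that it creates two parameter groups, which is what allows a separate option such as `lr` to be set in each dict.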