I have two networks, net1 and net2. They have different learning rate schedules but are trained jointly.
net1_optimizer = torch.optim.Adam(net1.parameters(), lr=args.lr, betas=(0.5, 0.999))
net2_optimizer = torch.optim.Adam(net2.parameters(), lr=args.lr)
net1_lr_scheduler = torch.optim.lr_scheduler.StepLR(net1_optimizer, step_size=100, gamma=0.1)
net2_lr_scheduler = torch.optim.lr_scheduler.StepLR(net2_optimizer, step_size=100, gamma=0.1)
The two networks produce two losses, loss1 and loss2. The total loss is
loss_total = loss1+loss2
When calling step, should I call it separately for each optimizer and each scheduler? So is it correct when I used two separate optimizers and schedulers like this?
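Roughly, the loop I have in mind looks like this (just a sketch; dataloader, criterion, num_epochs, and how each loss is computed are placeholders, not my real code):

for epoch in range(num_epochs):
    for inputs, targets in dataloader:
        net1_optimizer.zero_grad()
        net2_optimizer.zero_grad()
        loss1 = criterion(net1(inputs), targets)   # placeholder loss for net1
        loss2 = criterion(net2(inputs), targets)   # placeholder loss for net2
        loss_total = loss1 + loss2
        loss_total.backward()        # one backward pass through both nets
        net1_optimizer.step()        # step each optimizer separately
        net2_optimizer.step()
    net1_lr_scheduler.step()         # step each scheduler separately, once per epoch
    net2_lr_scheduler.step()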
It's correct and necessary. If the question is about the step method of the learning rate scheduler (in this case StepLR), then you should also call the schedulers' step method, which is different from that of the optimizers.
Since you say that they are trained jointly, it might be preferable to use one optimizer with the parameters from both nets, but keep separate lr schedules as you did, then call their step methods as necessary (i.e. at each epoch or iteration, depending on how you want to count them). See the example here, which does not include the optimizer step.
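Something like this, as a rough sketch (num_epochs and train_one_epoch are placeholders, and I'm expressing the per-net schedules with a single LambdaLR that takes one lambda per param group, rather than two scheduler objects):

params = [
    {'params': net1.parameters()},
    {'params': net2.parameters()},
]
optimizer = torch.optim.Adam(params, lr=args.lr)
# One multiplicative-factor function per param group; both reproduce
# StepLR(step_size=100, gamma=0.1) here, but they could differ.
lr_lambdas = [
    lambda epoch: 0.1 ** (epoch // 100),   # schedule for net1's group
    lambda epoch: 0.1 ** (epoch // 100),   # schedule for net2's group
]
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambdas)

for epoch in range(num_epochs):
    train_one_epoch(optimizer)   # placeholder for your training code
    scheduler.step()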
However, I can't see how the schedules are different for the two nets from reading your code, since they have the same step size, gamma, and initial learning rate. Did I misunderstand something?
Thanks. I made a mistake when copy-pasting. Actually, the first net uses betas=(0.5, 0.999) while the second net uses betas=(0.9, 0.999). That is the reason why I used two separate optimizers (and hence separate schedulers).
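So the optimizers actually look like this:

net1_optimizer = torch.optim.Adam(net1.parameters(), lr=args.lr, betas=(0.5, 0.999))
net2_optimizer = torch.optim.Adam(net2.parameters(), lr=args.lr, betas=(0.9, 0.999))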