Hey,
I have defined the following optimizer with different learning rates for each parameter group:
optimizer = optim.SGD([
    {'params': param_groups[0], 'lr': CFG.lr,      'weight_decay': CFG.weight_decay},
    {'params': param_groups[1], 'lr': 2 * CFG.lr,  'weight_decay': 0},
    {'params': param_groups[2], 'lr': 10 * CFG.lr, 'weight_decay': CFG.weight_decay},
    {'params': param_groups[3], 'lr': 20 * CFG.lr, 'weight_decay': 0},
], lr=CFG.lr, momentum=0.9, weight_decay=CFG.weight_decay, nesterov=CFG.nesterov)
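For reference, CFG and param_groups look roughly like this; the model and the exact field values below are placeholders standing in for my real setup:

import torch.nn as nn
import torch.optim as optim

class CFG:
    # placeholder values; my real config differs
    lr = 1e-3
    min_lr = 1e-6
    weight_decay = 1e-4
    nesterov = True

# placeholder model: in my code the four groups are different parts
# of the network (e.g. backbone/head, weights/biases)
model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 8),
                      nn.Linear(8, 4), nn.Linear(4, 2))
param_groups = [list(block.parameters()) for block in model]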
Now I want to use an LR scheduler to update all of the learning rates, not only the first one. By default, would a scheduler only update param_groups[0]?
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=5, T_mult=2, eta_min=CFG.min_lr, last_epoch=-1, verbose=True)
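I then run one update and print the optimizer, roughly like this (forward/backward pass omitted):

optimizer.step()
scheduler.step()   # one scheduler update
print(optimizer)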
This gives me:
SGD (
Parameter Group 0
    dampening: 0
    initial_lr: 0.001
    lr: 0.0009999603905218616
    momentum: 0.9
    nesterov: True
    weight_decay: 0.0001

Parameter Group 1
    dampening: 0
    initial_lr: 0.002
    lr: 0.002
    momentum: 0.9
    nesterov: True
    weight_decay: 0

Parameter Group 2
    dampening: 0
    initial_lr: 0.01
    lr: 0.01
    momentum: 0.9
    nesterov: True
    weight_decay: 0.0001

Parameter Group 3
    dampening: 0
    initial_lr: 0.02
    lr: 0.02
    momentum: 0.9
    nesterov: True
    weight_decay: 0
)
So after one update, only Parameter Group 0's learning rate has changed.
Any idea how to update all the learning rates with a scheduler?
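For completeness, I know I could rescale every group by hand after each epoch with something like the sketch below (set_cosine_lrs is just a hypothetical helper for illustration), but I'd prefer to keep using the built-in scheduler:

import math

def set_cosine_lrs(optimizer, epoch, T_0=5, eta_min=0.0):
    # Hypothetical manual fallback: rescale every group from its own
    # initial_lr with the same cosine factor (restart every T_0 epochs).
    # Assumes each group has an 'initial_lr' entry (a scheduler adds one;
    # otherwise you would have to store it yourself).
    factor = (1 + math.cos(math.pi * (epoch % T_0) / T_0)) / 2
    for group in optimizer.param_groups:
        group['lr'] = eta_min + (group['initial_lr'] - eta_min) * factor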