Hey friendly people,
I am currently testing a bunch of learning rate schedulers and I was wondering whether it is intentional that CosineAnnealingLR does not reset (at least when coupled with an Adam optimizer) and the learning rate instead follows a full cosine.
In contrast, CosineAnnealingWarmRestarts in its default configuration does have resets.
From my understanding of these methods, their behavior in the default configuration should be identical, or am I missing something? Maybe CosineAnnealingLR is waiting for a reset signal from the optimizer because it is intended to be coupled with momentum SGD, and that signal never comes because I used Adam?
It would be great if someone could help me with this.
EDIT: I revisited the documentation of CosineAnnealingLR and, as I understand it, it is not intended to reset at all. So the parameter T_max only alters the period of the cosine, right?
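If I read the docs right, the closed form is eta_t = eta_min + (eta_max - eta_min) * (1 + cos(pi * t / T_max)) / 2, so the curve should simply repeat every 2 * T_max steps instead of resetting. A minimal check of that reading (assuming eta_min = 0 and Adam's default lr of 1e-3 as eta_max):

import math

import torch

base_lr, eta_min, T_max = 1e-3, 0.0, 10

# closed form from the CosineAnnealingLR docs (my reading of it):
# eta_t = eta_min + (base_lr - eta_min) * (1 + cos(pi * t / T_max)) / 2
closed_form = [
    eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * t / T_max)) / 2
    for t in range(100)
]

params = torch.nn.Parameter(torch.zeros(1))
opt = torch.optim.Adam([params], lr=base_lr)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=T_max, eta_min=eta_min)

stepped = []
for _ in range(100):
    stepped.append(sched.get_last_lr()[0])
    sched.step()

# if the stepped schedule really follows the closed form, this difference
# should only be floating-point noise, and the period is 2 * T_max
print(max(abs(a - b) for a, b in zip(closed_form, stepped)))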
I did a little mockup to test this, in case it is of any help:
import torch
import matplotlib.pyplot as plt

# dummy parameter so the optimizers have something to hold (never actually optimized)
params = torch.tensor((10.0, 10.0))

# one optimizer per scheduler, so the two schedulers don't overwrite each other's lr
opt_cos = torch.optim.Adam([params])
opt_warm = torch.optim.Adam([params])

cos = torch.optim.lr_scheduler.CosineAnnealingLR(opt_cos, 10)
warm = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(opt_warm, 10)

warm_log = []
cos_log = []
for i in range(100):
    # record the current learning rate of each scheduler, then advance both
    warm_log.append(*warm.get_last_lr())
    cos_log.append(*cos.get_last_lr())
    warm.step()
    cos.step()

plt.figure()
plt.plot(warm_log)
plt.plot(cos_log)
plt.show()
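For reference, this is what I would expect the two logs to show if my reading of the docs is right (assuming Adam's default lr of 1e-3 as the base lr):

# CosineAnnealingLR: no reset, the curve mirrors back up, so it repeats every 2 * T_max = 20 steps
print(cos_log[0], cos_log[20], cos_log[40])     # should all be (roughly) the base lr of 1e-3
# CosineAnnealingWarmRestarts: hard reset to the base lr every T_0 = 10 steps
print(warm_log[0], warm_log[10], warm_log[20])  # should all be the base lr of 1e-3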