Inconsistency in LambdaLR learning rate

I had to implement my own cosine annealing learning rate schedule because the one provided by PyTorch handles the LR warmup differently, so I wrote the schedule myself and wrapped it in a LambdaLR.

The problem is that if I print the learning rate from inside my lambda function, it prints the expected value, but when I print it via optimizer.param_groups[0]["lr"] it is exactly 20 times lower, and the model actually trains at a learning rate 20 times lower than expected. I can verify this by swapping in some other learning rate scheduler instead of the lambda scheduler. What could possibly be wrong? If the issue can't be resolved from this description alone, I will try to reproduce it on a toy problem and post the code here.
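For context, the setup is roughly like the sketch below (simplified, not my actual schedule code; the warmup/cosine formula and step counts are just placeholders). Note that LambdaLR multiplies the optimizer's initial lr by whatever value the lambda returns, so the lambda is expected to return a multiplicative factor rather than the learning rate itself:

```python
import math
import torch

model = torch.nn.Linear(10, 2)  # placeholder model
base_lr = 1.0                   # LambdaLR multiplies this by the factor returned below
optimizer = torch.optim.SGD(model.parameters(), lr=base_lr)

warmup_steps, total_steps = 500, 10000  # placeholder values

def lr_lambda(step):
    # Linear warmup followed by cosine annealing.
    # The return value is a multiplicative factor: effective lr = base_lr * factor.
    if step < warmup_steps:
        factor = step / max(1, warmup_steps)
    else:
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        factor = 0.5 * (1.0 + math.cos(math.pi * progress))
    print("factor computed inside the lambda:", factor)
    return factor

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_lambda)

for step in range(3):
    optimizer.step()
    scheduler.step()
    # This is the learning rate the model actually trains with.
    print("optimizer.param_groups[0]['lr']:", optimizer.param_groups[0]["lr"])
```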

If you are interested in LR warmup schedulers, try pytorch_warmup.

Colab example | PyPI
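Roughly, it is used like this (a sketch from memory, so please check the Colab example and docs for the exact, current API; LinearWarmup, warmup_period, and dampening() are the names I recall):

```python
import torch
import pytorch_warmup as warmup

model = torch.nn.Linear(10, 2)  # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

num_steps = 10000
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_steps)
warmup_scheduler = warmup.LinearWarmup(optimizer, warmup_period=500)

for step in range(num_steps):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 10)).sum()  # dummy loss for illustration
    loss.backward()
    optimizer.step()
    with warmup_scheduler.dampening():  # dampens the lr during the warmup period
        lr_scheduler.step()
```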

I am having the same issue. Did you find the reason behind this?