How to choose Learning Rate and Scheduler

I’m using torchvision’s ResNet18 and EfficientNet-B0 models to train on CIFAR-10 and CIFAR-100. The ResNet50 model converges when I set the learning rate to 0.00025, but when I change the learning rate to 0.001 or 0.01 the model does not learn at all, i.e., the loss stays constant. I suspect the gradients are too large for backprop to stay stable. I have also tried the Cosine Annealing scheduler and weight decay values of 1e-4 and 0.05, but saw no improvement. Could you please point out where I’m going wrong?

You didn’t mention which optimizer you used. Here is an article with learning rates used for ResNet18 on CIFAR-10. Perhaps you can use the article to compare your results.
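When comparing runs, it can also help to print the learning rate the schedule actually produces at each epoch, since a cosine schedule starts at the full base rate. Below is a minimal sketch of the cosine annealing closed form in plain Python. The function name `cosine_lr`, the 30-epoch budget, and the optional linear warmup are assumptions added for illustration; none of them come from the question or the article.

```python
import math


def cosine_lr(epoch, base_lr, total_epochs, eta_min=0.0, warmup_epochs=0):
    """Learning rate at a 0-indexed epoch: optional linear warmup, then cosine decay.

    Hypothetical helper for inspecting a schedule; not taken from the thread.
    """
    if warmup_epochs > 0 and epoch < warmup_epochs:
        # Linear ramp that reaches base_lr on the last warmup epoch.
        return base_lr * (epoch + 1) / warmup_epochs
    # Fraction of the cosine phase completed, clamped to [0, 1].
    t = epoch - warmup_epochs
    t_max = max(1, total_epochs - warmup_epochs)
    progress = min(t / t_max, 1.0)
    # eta_min + 0.5 * (base_lr - eta_min) * (1 + cos(pi * progress))
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # The three base learning rates from the question, over an assumed 30 epochs.
    for base_lr in (0.00025, 0.001, 0.01):
        first = [cosine_lr(e, base_lr, total_epochs=30, warmup_epochs=3) for e in range(5)]
        print(base_lr, [f"{lr:.5f}" for lr in first])
```

Without warmup, the first epochs run at essentially the full base rate, so a cosine schedule alone would not be expected to rescue a base rate that is already too high for the chosen optimizer.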

Hi, thanks for your reply. Could you please share the code you used for your runs? Thanks!

Here is a kaggle notebook:

Optimizer and scheduler are defined here in the same notebook:
