I am using the Adam optimizer and training for 100 epochs. Which of the following two learning-rate schedulers sounds better for this setup?
from torch.optim import lr_scheduler

optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, weight_decay=5e-4)  # best: 5e-4; also tried 4e-3
# Decay LR by a factor of 0.1 at fixed milestones
# exp_lr_scheduler = lr_scheduler.MultiStepLR(optimizer, milestones=[20, 40, 90], gamma=0.1)  # also tried gamma=0.3; milestones 30,90,130 and 20,90,130 -> 150
# or
# Decay LR by a factor of 0.1 every epoch
exp_lr_scheduler = lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.1)
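To see how differently the two options behave, here is a small self-contained sketch that traces the learning rate over 100 epochs for each scheduler (a dummy parameter stands in for `model.parameters()`, and the base LR of 0.01 is just an assumption for illustration):

```python
import torch
from torch.optim import lr_scheduler

def lr_trace(make_scheduler, epochs=100, base_lr=0.01):
    # Dummy parameter so we can build a real optimizer without a model.
    param = torch.nn.Parameter(torch.zeros(1))
    opt = torch.optim.Adam([param], lr=base_lr, weight_decay=5e-4)
    sched = make_scheduler(opt)
    trace = []
    for _ in range(epochs):
        trace.append(opt.param_groups[0]["lr"])  # LR used this epoch
        opt.step()
        sched.step()
    return trace

# Option 1: decay by 0.1 at epochs 20, 40, 90.
multistep = lr_trace(lambda o: lr_scheduler.MultiStepLR(o, milestones=[20, 40, 90], gamma=0.1))

# Option 2: decay by 0.1 every single epoch.
step = lr_trace(lambda o: lr_scheduler.StepLR(o, step_size=1, gamma=0.1))

# multistep: 0.01 for epochs 0-19, then 1e-3, 1e-4, 1e-5.
# step: 0.01 * 0.1**epoch, i.e. ~1e-102 by epoch 100.
```

Note that with `step_size=1` and `gamma=0.1`, the LR is multiplied by 0.1 every epoch, so it is effectively zero after only a handful of epochs and the model stops learning long before epoch 100; the milestone-based option keeps the LR useful for much longer.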