I made a little mistake in my code (I think, at least), where I put the following:
```python
# Optimizer
wd = 0.01
optimizer = optim.Adam(model.parameters(), lr=10e-4, weight_decay=wd)

# LR scheduler (OneCycleLR is only available from PyTorch v1.3.0)
scheduler = optim.lr_scheduler.OneCycleLR(optimizer, max_lr=1e-4,
                                          steps_per_epoch=len(train_loader),
                                          epochs=epochs, pct_start=0.3)
```
Notice that in the optimizer I set a learning rate of 10e-4 (which is 1e-3, not 1e-4), whereas I set `max_lr` in `OneCycleLR` to 1e-4. So in this situation, what does it actually do? I am a little confused; could anyone enlighten me, please?
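As a quick way to see what happens, one can build the same optimizer/scheduler pair with a throwaway model and print the learning rate the optimizer actually holds. This is just a self-contained sketch: the `nn.Linear` model and the `steps_per_epoch`/`epochs` values are placeholders standing in for the real training setup. As far as I understand, `OneCycleLR` overwrites the optimizer's `lr` when it is constructed, starting the schedule at `max_lr / div_factor` (with `div_factor` defaulting to 25), so the `lr=10e-4` passed to `Adam` should be effectively ignored:

```python
import torch
from torch import nn, optim

# Throwaway model just to have some parameters (placeholder for the real model).
model = nn.Linear(4, 1)

wd = 0.01
optimizer = optim.Adam(model.parameters(), lr=10e-4, weight_decay=wd)

# steps_per_epoch and epochs are made-up values so the sketch runs standalone.
scheduler = optim.lr_scheduler.OneCycleLR(optimizer, max_lr=1e-4,
                                          steps_per_epoch=100,
                                          epochs=10, pct_start=0.3)

# OneCycleLR sets the starting lr to max_lr / div_factor = 1e-4 / 25,
# regardless of the lr passed to Adam above.
print(optimizer.param_groups[0]["lr"])
```

If my understanding is right, this should print roughly 4e-6 (i.e. 1e-4 / 25) rather than either 10e-4 or 1e-4.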