# How to increase the learning rate without using cyclical learning rates?

Hi,

I have a quick question. Section 4.2 of the ResNet paper describes an architecture for the CIFAR-10 dataset. When discussing a very deep ResNet, the authors write:

> So we use 0.01 to warm up the training until the training error is below 80% (about 400 iterations), and then go back to 0.1 and continue training.

Therefore, I would like to have the following learning rate schedule:

• LR = 10^{-2} for the first two epochs,
• LR = 10^{-1} for the rest of the training.

However, I haven’t found a way to achieve exactly this (cyclical learning rates would only increase the learning rate gradually, not in a single step). Any help would be appreciated. (-:

```python
import torch
import torch.nn as nn
from torch.optim.lr_scheduler import ConstantLR, ReduceLROnPlateau

model = nn.Sequential(
    nn.Linear(2, 50),
    nn.Sigmoid(),
    nn.Linear(50, 50),
    nn.Sigmoid(),
    nn.Linear(50, 2),
)

criteria = nn.CrossEntropyLoss()
optim = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)


def get_toyData(N1, N2):
    # Two Gaussian blobs: class 1 centered at 0, class 0 centered at 1.5
    labels = torch.cat([torch.ones(N1), torch.zeros(N2)])
    data = torch.cat([torch.randn(N1, 2), torch.randn(N2, 2) + 1.5])
    return data, labels


N1, N2 = 1000, 1000
train_data, train_labels = get_toyData(N1, N2)

dataset = torch.utils.data.TensorDataset(train_data, train_labels)
# The original DataLoader line was cut off; batch_size=32 is an assumed value
dl = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

# Scale the lr by 0.1 (0.1 -> 0.01) for the first 40 epochs, then restore it
scheduler1 = ConstantLR(optim, factor=0.1, total_iters=40)
# scheduler2 = ExponentialLR(optim, gamma=0.9)
scheduler2 = ReduceLROnPlateau(optim, 'min')

for i in range(100):
    totalLoss = 0
    train_error = 0
    for x, t in dl:
        optim.zero_grad()  # clear gradients left over from the previous batch
        out = model(x)
        loss = criteria(out, t.long())
        loss.backward()
        optim.step()
        totalLoss += loss.item()
    # Schedulers are stepped once per epoch
    if i < 40:
        scheduler1.step()
    else:
        scheduler2.step(totalLoss)
    print(i, optim.state_dict()['param_groups'][0]['lr'])

    # print(i, totalLoss, train_error / 1300)
```

Assuming the optimizer uses lr = 0.1 for all parameter groups, this gives:
lr = 0.01 if epoch < 40,
lr = 0.1 at epoch 40,
and after epoch 40, ReduceLROnPlateau multiplies the lr by 0.1 whenever the loss plateaus.

First of all: thanks! My apologies, but there’s one thing I forgot to mention in my original post. Let me quote section 4.2 of the ResNet paper again:

> So we use 0.01 to warm up the training until the training error is below 80% (about 400 iterations), and then go back to 0.1 and continue training. The rest of the learning schedule is as done previously.

“As done previously” refers to this:

> We start with a learning rate of 0.1, divide it by 10 at 32k and 48k iterations, and terminate training at 64k iterations, […].

So the learning rate schedule should look something like this:
lr = 0.01 if num_iters < 400,
lr = 0.1 if 400 <= num_iters < 32k,
lr = 0.01 if 32k <= num_iters < 48k,
lr = 0.001 if 48k <= num_iters < 64k.

I’d also be happy to use epochs instead of number of iterations, but I’m not sure how to achieve either. )-:

Hi,

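```python
from torch.optim.lr_scheduler import ConstantLR, MultiStepLR

scheduler1 = ConstantLR(optim, factor=0.1, total_iters=400)
scheduler2 = MultiStepLR(optim, milestones=[32000, 48000], gamma=0.1)

# in the training inner loop (i is the epoch index from the outer loop)
for j, (x, t) in enumerate(dl):
    optim.zero_grad()  # clear gradients left over from the previous batch
    out = model(x)
    loss = criteria(out, t.long())
    loss.backward()
    optim.step()
    totalLoss += loss.item()
    # i * len(dl) + j is the global iteration count
    if i * len(dl) + j < 400:
        scheduler1.step()
        scheduler2.step()
    else:
        scheduler2.step()
```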

I’m not sure the `if` statement is still necessary, though.
Also, you can use `LambdaLR` to do anything you want.

Hi! Actually, I think that the if-statement is indeed not needed:

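```python
from torch.optim.lr_scheduler import ChainedScheduler, ConstantLR, MultiStepLR

# factor=0.1 scales the base lr of 0.1 down to 0.01 for the first 400 iterations
scheduler1 = ConstantLR(optimizer, factor=0.1, total_iters=400)
scheduler2 = MultiStepLR(optimizer, milestones=[32000, 48000], gamma=0.1)
chained_scheduler = ChainedScheduler([scheduler1, scheduler2])

for batch_idx, (data, labels) in enumerate(train_loader):
    ...
    chained_scheduler.step()  # step once per batch, after optimizer.step()
```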
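To double-check a schedule like this without training anything, one option is to step it on a dummy optimizer and read off the lr at the boundaries. This is only a sketch: `trace_lr` and `demo` are helper names I made up, the probe iterations are my choice, and I'm assuming a base lr of 0.1 with `factor=0.1` so that the warm-up lr comes out as 0.01.

```python
def trace_lr(optimizer, scheduler, num_steps, probes):
    """Step `scheduler` num_steps times; record the lr in effect at each probe iteration."""
    seen = {}
    for step in range(num_steps):
        if step in probes:
            seen[step] = optimizer.param_groups[0]["lr"]
        optimizer.step()   # optimizer first, then scheduler, as in a real loop
        scheduler.step()
    return seen


def demo():
    # torch is imported here so that trace_lr itself has no dependencies
    import torch
    from torch.optim.lr_scheduler import ChainedScheduler, ConstantLR, MultiStepLR

    param = torch.nn.Parameter(torch.zeros(1))    # dummy parameter, never trained
    optimizer = torch.optim.SGD([param], lr=0.1)  # assumed base lr of 0.1
    scheduler = ChainedScheduler([
        ConstantLR(optimizer, factor=0.1, total_iters=400),
        MultiStepLR(optimizer, milestones=[32000, 48000], gamma=0.1),
    ])
    probes = {0, 399, 400, 31999, 32000, 47999, 48000}
    return trace_lr(optimizer, scheduler, 48001, probes)
```

Calling `demo()` should give roughly 0.01 at iterations 0 and 399, 0.1 at 400 and 31999, 0.01 at 32000 and 47999, and 0.001 at 48000 (up to floating-point rounding).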

If I can ask a brief follow-up question: I had actually tried using LambdaLR for this problem as well, but in the end I didn’t really know how to implement it… I’d be happy if you showed me how! (-:

This should work:

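```python
from torch.optim.lr_scheduler import LambdaLR


def _lr_lambda(current_step):
    """
    Return the multiplicative factor applied to the base lr (assumed to be 0.1)
    for the given integer step count.
    """
    if current_step < 400:
        _lr = 0.1
    elif current_step < 32000:
        _lr = 1
    elif current_step < 48000:
        _lr = 0.1
    else:
        _lr = 0.01

    return _lr


# last_epoch is left at its default of -1, i.e. the schedule starts from scratch
scheduler = LambdaLR(optimizer, lr_lambda=_lr_lambda)
```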
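On the epochs question: if the scheduler is stepped once per epoch instead of once per batch, the milestones just need to be divided by the number of batches per epoch. Here is a rough sketch of that arithmetic. `iters_to_epochs` is a helper name I made up, and the dataset size of 45,000 and batch size of 128 are assumptions on my part (neither is given in this thread), so plug in your own values.

```python
import math


def iters_to_epochs(iter_milestones, dataset_size, batch_size, drop_last=False):
    """Convert iteration milestones to the nearest whole-epoch milestones."""
    # With drop_last=True the final partial batch is skipped, so round down
    rounder = math.floor if drop_last else math.ceil
    batches_per_epoch = rounder(dataset_size / batch_size)
    return [round(m / batches_per_epoch) for m in iter_milestones]


# Assumed values: 45,000 training images, batch size 128 (352 batches per epoch)
print(iters_to_epochs([400, 32000, 48000, 64000], dataset_size=45000, batch_size=128))
# prints [1, 91, 136, 182]
```

Under those assumptions the 400-iteration warm-up is only a little over one epoch, so rounding to whole epochs changes it noticeably. Stepping the scheduler once per batch, as in the snippets above, stays closer to the iteration counts quoted from the paper.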