I see that the new PyTorch release comes with a much-needed learning rate scheduler; several of them, in fact. I want to use `torch.optim.lr_scheduler.MultiStepLR`, since I need it to divide my learning rate at certain milestones.
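For context, this is how I understand `MultiStepLR` to behave, as a minimal sketch with a dummy parameter just to watch the learning rate decay (the small milestone values here are placeholders, not my real schedule):

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

# Dummy parameter so we can build an optimizer without a real model.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = SGD([param], lr=0.1)
scheduler = MultiStepLR(optimizer, milestones=[2, 4], gamma=0.1)

for epoch in range(6):
    # Inspect the learning rate the optimizer would use this epoch.
    print(epoch, optimizer.param_groups[0]['lr'])
    optimizer.step()      # dummy update, so stepping the scheduler is valid
    scheduler.step()      # advance the schedule once per epoch
# Prints 0.1 for epochs 0-1, 0.01 for epochs 2-3, 0.001 for epochs 4-5:
# the lr is multiplied by gamma at each milestone.
```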
What I do is:
```python
import torch.optim as optim
from torch.autograd import Variable
from torch.optim.lr_scheduler import MultiStepLR

learning_rate = 0.1
momentum = 0.9
optimizer = optim.SGD(net.parameters(), lr=learning_rate,
                      momentum=momentum, weight_decay=0.0001)
scheduler = MultiStepLR(optimizer, milestones=[60, 100, 150, 400], gamma=0.1)

for epoch in range(number_of_training_epochs):  # loop over the dataset multiple times
    for i, data in enumerate(train_loader, 0):
        inputs, labels = data
        inputs = inputs.float()
        inputs, labels = Variable(inputs.cuda()), Variable(labels.cuda())
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        scheduler.step()
```
What I see is that the network is now not learning anything. Should I also be calling `optimizer.step()`?
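In other words, is the intended pattern something like the sketch below, with `optimizer.step()` applied per batch and `scheduler.step()` moved out to once per epoch? (Same `net`, `criterion`, and `train_loader` as above; I'm guessing at the placement.)

```python
for epoch in range(number_of_training_epochs):
    for i, data in enumerate(train_loader, 0):
        inputs, labels = data
        inputs = inputs.float()
        inputs, labels = Variable(inputs.cuda()), Variable(labels.cuda())
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()   # apply the gradient update for this batch
    scheduler.step()       # advance the LR schedule once per epoch
```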