How could I design my own optimizer scheduler

Hi,

If I need to modify the schedule of my optimizer in an irregular way, how could I do it?

For example, I have an Adam optimizer, and I need it to keep working with its default parameters until the 1000th iteration. Then I need to change beta1 to 0.3, and for the rest of training I need its learning rate to decay by a factor of 0.9999 per iteration. How could I do this with PyTorch?

You can write a custom function to change the parameters of the optimizer as training goes on.

def adjust_optim(optimizer, n_iter):
    # at iteration 1000, set beta1 to 0.3 (keep the current beta2)
    if n_iter == 1000:
        optimizer.param_groups[0]['betas'] = (0.3, optimizer.param_groups[0]['betas'][1])
    # after iteration 1000, decay the learning rate by 0.9999 every iteration
    if n_iter > 1000:
        optimizer.param_groups[0]['lr'] *= 0.9999

If you are using multiple param groups, change 0 to i for the ith group.
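For completeness, here is a minimal sketch of how it could be called inside a training loop. The toy model and random data are just placeholders for your own setup:

import torch
import torch.nn as nn
import torch.optim as optim

# toy model and data, only to make the sketch runnable
model = nn.Linear(10, 1)
optimizer = optim.Adam(model.parameters())  # default lr=1e-3, betas=(0.9, 0.999)
loss_fn = nn.MSELoss()

for n_iter in range(1, 2001):
    x, y = torch.randn(32, 10), torch.randn(32, 1)
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    optimizer.step()

    adjust_optim(optimizer, n_iter)  # beta1 -> 0.3 at iter 1000, lr *= 0.9999 afterwards

print(optimizer.param_groups[0]['betas'], optimizer.param_groups[0]['lr'])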

Best,


Many thanks. BTW, how could I tell whether I am using param_groups or defaults? It appears that I can print out both optim.defaults and optim.param_groups; they both exist in my optimizer.

It depends on how you construct the optimizer.
If you do

optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

that means you only have one param group.
If you do

optim.SGD([
    {'params': model.base.parameters()},
    {'params': model.classifier.parameters(), 'lr': 1e-3}
], lr=1e-2, momentum=0.9)

that means you have two param groups.
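If you want to check this at runtime, you can simply inspect optimizer.param_groups. A small sketch, with a toy module standing in for model.base / model.classifier:

import torch.nn as nn
import torch.optim as optim

# toy module mirroring the structure in the snippet above
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.base = nn.Linear(10, 10)
        self.classifier = nn.Linear(10, 2)

    def forward(self, x):
        return self.classifier(self.base(x))

model = Net()
optimizer = optim.SGD([
    {'params': model.base.parameters()},
    {'params': model.classifier.parameters(), 'lr': 1e-3}
], lr=1e-2, momentum=0.9)

print(len(optimizer.param_groups))       # 2
print(optimizer.param_groups[0]['lr'])   # 0.01  (falls back to the outer lr)
print(optimizer.param_groups[1]['lr'])   # 0.001 (overridden for the classifier)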

Sorry, I edited the first post. It seems that changing .defaults won’t change the optimizer setup for param group 0 (when there is only one group).
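You can verify that quickly with a throwaway optimizer (hypothetical values):

import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 2)
optimizer = optim.SGD(model.parameters(), lr=0.01)

optimizer.defaults['lr'] = 0.5          # only edits the stored constructor defaults
print(optimizer.param_groups[0]['lr'])  # still 0.01 -- this is what step() uses

optimizer.param_groups[0]['lr'] = 0.5   # changing the param group is what takes effect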

Sorry, I should have searched enough before answering the question.


Thanks a lot, I seem to get the idea now.

Check out the function exp_lr_scheduler in my fine-tuning tutorial linked below. That lets you decay the LR. I just multiply it by a constant, but you can do anything fancy you want using the same code structure.
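For reference, a function in that spirit could look roughly like this. This is a sketch, not the exact code from the tutorial, and the argument names (init_lr, decay_rate, epoch_step) are made up:

def exp_lr_decay(optimizer, epoch, init_lr=0.001, decay_rate=0.1, epoch_step=7):
    """Multiply the initial LR by decay_rate every epoch_step epochs."""
    lr = init_lr * (decay_rate ** (epoch // epoch_step))
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr
    return optimizer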

Thanks for your helpful tutorial!! However, I have a question that may not be so relevant to this thread.
I noticed that Caffe allows setting different learning rates for the weight and bias tensors of a conv layer (usually lr for the weight and 2*lr for the bias). Could I do this with PyTorch without too much tedious work constructing the param_groups for the optimizer?
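To make the question concrete, here is a rough sketch of the kind of manual param_groups construction I mean, using a toy conv model and named_parameters() to split weights from biases:

import torch.nn as nn
import torch.optim as optim

# toy conv net standing in for the real model
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))

base_lr = 0.01
weights = [p for name, p in model.named_parameters() if not name.endswith('bias')]
biases = [p for name, p in model.named_parameters() if name.endswith('bias')]

optimizer = optim.SGD([
    {'params': weights, 'lr': base_lr},
    {'params': biases, 'lr': 2 * base_lr},  # Caffe-style 2x lr for biases
], lr=base_lr, momentum=0.9)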