Training loss and validation loss do not change after I change the learning rate

Lincoln10153 · November 12, 2020, 9:37am

Hi,

I am new to PyTorch and have run into a problem.
The training loss and validation loss do not change at all after I change the learning rate several times (logs below). But if I change the batch size, the training loss and validation loss on each epoch do change.

I would like to know why the per-epoch training loss and validation loss stay the same.

Thanks!

SGD 0.001
-------------------------
7401
1.2.0
There are 1 CUDA devices
Setting torch GPU to 0
Using device:0 
begin training!
Epoch:1/150
Train_loss:1.79059
Vali_loss:1.79172
Time_elapse:13.118939876556396
'
Epoch:2/150
Train_loss:1.78832
Vali_loss:1.78894
Time_elapse:23.398069381713867
'
Epoch:3/150
Train_loss:1.78577
Vali_loss:1.78627
Time_elapse:33.67959260940552
'
...

SGD 0.01
-------------------------
7401
1.2.0
There are 1 CUDA devices
Setting torch GPU to 0
Using device:0 
begin training!
Epoch:1/150
Train_loss:1.79059
Vali_loss:1.79172
Time_elapse:13.118939876556396
'
Epoch:2/150
Train_loss:1.78832
Vali_loss:1.78894
Time_elapse:23.398069381713867
'
Epoch:3/150
Train_loss:1.78577
Vali_loss:1.78627
Time_elapse:33.67959260940552
'
...

Usama_Hasan · November 12, 2020, 9:38am

Hi @Lincoln10153
Can you share a code snippet?

Lincoln10153 · November 12, 2020, 9:47am

Thanks for your reply!

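import torch

# Module-level switches (a `global` statement is not needed at module scope)
training = True
prediction = True
cuda = True  # set to False to run on the CPU
seed = 5  # fixed seed so that runs are reproducible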

def run(opt="SGD", gpu=0, lr=0.001, lr_schedule=False, identifier='', scene_classification=False):

    # directories
    model_dir = r'.\model'
    train_dir = r'C:\Users\train_img'
    vali_dir = r'C:\Users\vali_img'
    test_dir = r'C:\Users\test_img'
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        print(torch.__version__)
        if not cuda:
            print("You have a CUDA device")
        else:
            torch.cuda.set_device(gpu)
            torch.cuda.manual_seed(seed)

    # building the net
    model = Unet_6(features=[32, 64]) 

    if cuda:
        model = model.cuda()
    new_identifier = '8' + identifier
    trainer = Trainer(net=model, train_dir=train_dir, vali_dir=vali_dir, test_dir=test_dir, model_dir=model_dir,
                      opt=opt, lr=lr, cuda=cuda, identifier=new_identifier, lr_schedule=lr_schedule)


    # training
    if training:
        bs = 8
        trainer.train_model(epoch=150, bs=bs)

Usama_Hasan · November 12, 2020, 11:22am

Can you share your Trainer class? It’s unclear what is going wrong here.

Lincoln10153:

 Trainer(net=model, train_dir=train_dir, vali_dir=vali_dir, test_dir=test_dir, model_dir=model_dir,
                      opt=opt, lr=lr, cuda=cuda, identifier=new_identifier, lr_schedule=lr_schedule)

Lincoln10153 · November 12, 2020, 11:53am

Thanks!
I double-checked my code just now, and I think I have finally found the problem: I was using learning rate decay, and the learning rate I passed in was not being used during training.
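
For anyone who hits the same symptom, here is a minimal sketch of how a decay routine can silently override the requested learning rate. The Trainer code was never posted, so everything below is an assumption for illustration: the helper names decay_lr_hardcoded, decay_lr_fixed and lrs_seen are hypothetical, and a small torch.nn.Linear stands in for the real model. The only PyTorch features relied on are torch.optim.SGD and the optimizer's param_groups list, which holds the learning rate actually in use.

```python
import torch


def decay_lr_hardcoded(optimizer, epoch, base_lr=0.001, gamma=0.9):
    """Hypothetical decay helper with the bug: it recomputes the rate from a
    hard-coded base_lr and overwrites whatever the optimizer was created with."""
    for group in optimizer.param_groups:
        group["lr"] = base_lr * gamma ** epoch


def decay_lr_fixed(optimizer, epoch, base_lr, gamma=0.9):
    """Same decay, but the caller must pass in the base learning rate."""
    for group in optimizer.param_groups:
        group["lr"] = base_lr * gamma ** epoch


def lrs_seen(requested_lr, decay_fn, epochs=3, **kwargs):
    """Return the learning rate the optimizer actually holds at each epoch."""
    model = torch.nn.Linear(4, 1)  # stand-in model, not the Unet_6 from the thread
    optimizer = torch.optim.SGD(model.parameters(), lr=requested_lr)
    seen = []
    for epoch in range(epochs):
        decay_fn(optimizer, epoch, **kwargs)
        seen.append(optimizer.param_groups[0]["lr"])
    return seen


# Buggy helper: both calls print the same schedule, so the requested rate is ignored.
print(lrs_seen(0.001, decay_lr_hardcoded))
print(lrs_seen(0.01, decay_lr_hardcoded))

# Fixed helper: the schedule now starts from the rate that was requested.
print(lrs_seen(0.01, decay_lr_fixed, base_lr=0.01))
```

In a real training loop, printing optimizer.param_groups[0]["lr"] once per epoch is a quick way to confirm which learning rate is actually being applied.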

Usama_Hasan · November 12, 2020, 12:41pm

Great, that’s good to know.