Setting model.eval() makes accuracy much worse

Hi all

I’m new to this forum, but I have some experience with ML, CNNs, PyTorch, and computer vision.

I’m trying to use transfer learning to fine-tune a resnet18 on my image classification task.

Everything seems fine except for one strange thing when I use model.eval().

The code is roughly like this:


import numpy as np
import torch
from sklearn.metrics import accuracy_score

for epoch in range(30):

    # --- training ---
    resnet.train()
    train_pred = []
    train_true = []
    for data in trainloader:
        img, lbl = data['image'].to(device), data['label'].to(device)

        optimizer.zero_grad()
        logits = resnet(img)
        loss = criterion(logits, lbl)
        preds = logits.max(dim=1)[1]

        loss.backward()
        optimizer.step()

        train_pred.append(preds.detach().cpu().numpy())
        train_true.append(lbl.detach().cpu().numpy())

    train_true = np.concatenate(train_true)
    train_pred = np.concatenate(train_pred)

    print('epoch {:d}: train mean accuracy: {:.3f}'.format(
        epoch + 1, accuracy_score(train_true, train_pred)), end=' ')

    # --- validation ---
    vald_pred = []
    vald_true = []

    # !!!
    resnet.eval()  # Setting this makes the result even worse

    with torch.no_grad():  # no gradients needed for validation
        for data in valdloader:
            img, lbl = data['image'].to(device), data['label'].to(device)
            logits = resnet(img)

            preds = logits.max(dim=1)[1]
            vald_pred.append(preds.cpu().numpy())
            vald_true.append(lbl.cpu().numpy())

    vald_true = np.concatenate(vald_true)
    vald_pred = np.concatenate(vald_pred)

    print('valid mean accuracy: {:.3f}'.format(accuracy_score(vald_true, vald_pred)))


I believe it’s recommended to call resnet.eval() in the snippet above.

However, if I set it, validation accuracy is around 38% and fluctuates a lot (epochs 25-30 have roughly the same accuracy as epochs 1-5), which gives me the feeling that something is wrong and my model isn’t really being trained…

If I don’t set it, validation accuracy starts around 75% and generally increases gradually to 85% over the last few epochs
(it also fluctuates, but much more like a normal training run).

(In both cases, training accuracy increases from 60% to 97%.)

Any suggestions on this?

Should I just not call eval(), for the sake of validation accuracy?
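For what it’s worth, one thing I’m considering checking is whether the BatchNorm layers are what changes between train() and eval(): eval() makes them use their running statistics instead of the current batch’s statistics. A minimal diagnostic sketch (just an idea, not part of the snippet above; it reuses the resnet name and the validation loop from my code):

import torch.nn as nn

def bn_to_train_mode(model):
    # put everything into eval() mode first (dropout off, BN on running stats)...
    model.eval()
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            # ...then force only the BatchNorm layers back to per-batch statistics
            m.train()

bn_to_train_mode(resnet)
# re-run the validation loop above; if accuracy jumps back up, the BatchNorm
# running statistics are the likely culprit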

Big thanks in advance!

Also, after inspecting the image / prediction / true-label output of my model during the validation epochs,

it outputs almost exclusively one label (the first one) for all images
(I have 4 classes, and the first one makes up only roughly 1/3 of the data).
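Roughly, the inspection looked like this (a quick sketch using the vald_pred / vald_true arrays built in the validation loop above):

import numpy as np

# count how often each class appears among the predictions vs. the true labels
print(np.unique(vald_pred, return_counts=True))   # almost everything is the first class
print(np.unique(vald_true, return_counts=True))   # the first class is only ~1/3 of the labels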

Anyway, after reading this very helpful post,

I tried increasing my batch_size from 12 to 128, and it works.

This time, validation accuracy with model.eval() looks much more normal.
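For reference, the only change was the batch size in the DataLoader setup. A rough sketch (train_dataset / vald_dataset are placeholder names for my datasets; transforms, num_workers, etc. omitted):

from torch.utils.data import DataLoader

# larger batches -> less noisy per-batch statistics, so the BatchNorm running
# estimates used by model.eval() are much closer to the real data statistics
trainloader = DataLoader(train_dataset, batch_size=128, shuffle=True)   # was batch_size=12
valdloader = DataLoader(vald_dataset, batch_size=128, shuffle=False)

If increasing the batch size hadn’t been an option, another knob I’ve seen mentioned is lowering the momentum of the nn.BatchNorm2d modules (default 0.1) so the running estimates are smoothed over more batches, but I haven’t tried that here.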

I’ll still leave this post up for reference, as I think it’s quite interesting.
