Model.eval() accuracy is 0 and running_corrects is 0

I’m having an issue with my DNN model.
During the training phase the accuracy is 0.968 and the loss is 0.103, but during the test phase with model.eval(), the accuracy and running_corrects are both 0.

def train(model, device, train_loader, criterion, optimizer, scheduler, epoch, iter_meter, experiment):
    model.train()
    liveloss = PlotLosses()
    data_len = len(train_loader.dataset)
    with experiment.train():
        
        logs = {}
        running_loss = 0.0
        running_corrects = 0
        
        for batch_idx, _data in enumerate(train_loader):
            features, labels = _data[:][:,:,:-1], _data[..., -1]
            features = features.permute(0, 2, 1)

            features, labels = features.to(device), labels.to(device) 
            
            optimizer.zero_grad()

            output = model(features) 
            loss = criterion(output, torch.max(labels, 1)[1])
            loss.backward()

            experiment.log_metric('loss', loss.item(), step=iter_meter.get())
            experiment.log_metric('learning_rate', scheduler.get_last_lr(), step=iter_meter.get())

            optimizer.step()
            scheduler.step()
            iter_meter.step()
            
            _, preds = torch.max(output, 1)
            running_loss += loss.detach() * features.size(0)
            running_corrects += torch.sum(preds == torch.max(labels, 1)[1])
        epoch_loss = running_loss / len(train_loader.dataset)
        epoch_acc = running_corrects.float() / len(train_loader.dataset)
        logs['log loss'] = epoch_loss.item()
        logs['accuracy'] = epoch_acc.item()
    liveloss.update(logs)
    liveloss.send()


iter_meter = IterMeter()
for epoch in range(1, epochs + 1):
    train(model, device, train_loader, criterion, optimizer, scheduler, epoch, iter_meter, experiment)
    

The evaluation script is:


def test(model, device, tst_loader, criterion, epoch, iter_meter, experiment):
    print('\nevaluating...')
    model.eval()
    test_loss = 0
    liveloss = PlotLosses()
    data_len = len(tst_loader.dataset)
    with experiment.test():
        with torch.no_grad():
        
            logs = {}
            running_loss = 0.0
            running_corrects = 0

            for batch_idx, _data in enumerate(tst_loader):
                features, labels = _data[:][:,:,:-1], _data[..., -1]
                features = features.permute(0, 2, 1)

                features, labels = features.to(device), labels.to(device)


                output = model(features) 
                loss = criterion(output, torch.max(labels, 1)[1])
                test_loss += loss.item() / len(tst_loader)

                experiment.log_metric('loss', loss.item(), step=iter_meter.get())

                iter_meter.step()

                _, preds = torch.max(output, 1)
                running_loss += loss.detach() * features.size(0)
                running_corrects += torch.sum(preds == torch.max(labels, 1)[1])
                
            epoch_loss = running_loss / len(tst_loader.dataset)
            epoch_acc = running_corrects.float() / len(tst_loader.dataset)
            logs['log loss'] = epoch_loss.item()
            logs['accuracy'] = epoch_acc.item()
    liveloss.update(logs)
    liveloss.send()

iter_meter = IterMeter()
for epoch in range(1, epochs + 1):
    test(model, device, tst_loader, criterion, epoch, iter_meter, experiment)

Is there something wrong with the code?

Which version of Python is this?
Some older versions of Python do not support +=.
That's probably not the issue, though, because it would raise an error.

I also noticed that you are not saving the model after training.
You pass the model into the training function as an argument and never return it after training.
So when you then pass the model into the testing function, it is not the already trained model.

Try making model a global variable, so that whatever changes the training function makes to it are also visible to the testing function.

Or

After training, return the already trained model from the train function and assign the result, e.g. model = train(...)

Thanks for the reply. The += operator worked during training with model.train(), so I don't think it's the culprit.

Look at this image and relate it to what my edit was talking about:
when I printed x, it did not change, because I only multiplied the x that was passed as a parameter into the 'me' function by 4, not the original x.
So x still remains 4.
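For reference, here is a minimal sketch of what the screenshot presumably showed (the function name 'me' comes from the post above; the exact code in the image is an assumption):

def me(x):
    # rebinding the local name x does not affect the caller's variable
    x = x * 4
    return x

x = 4
me(x)       # the returned value is discarded
print(x)    # still prints 4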

@ptrblck can you help please?

I'm not sure about the background of the previous discussion about the in-place operation.

For your initial problem: are you seeing the same accuracy drop if you call model.eval() and then run your training dataset through the model, for the sake of debugging?

I didn't get your point, but I call model.train() with the exact script above, and on the training data it shows an accuracy of 0.968 and a loss of 0.103.
Then I call model.eval(), and the accuracy is always 0.0000.
Do I have to save the model first before I call model.eval()?

No, you don’t need to save the model before calling eval().
Are you seeing the performance drop on both the training and the validation dataset after calling model.eval()?
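In other words, something along these lines (a minimal debugging sketch reusing the model, loaders, and data layout from the snippets above; the helper function is not part of the original code):

def check_accuracy(model, loader, device):
    # debugging helper (not from the original post): accuracy in eval mode
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for _data in loader:
            features, labels = _data[:][:, :, :-1], _data[..., -1]
            features = features.permute(0, 2, 1)
            features, labels = features.to(device), labels.to(device)
            preds = torch.argmax(model(features), dim=1)
            targets = torch.max(labels, 1)[1]
            correct += (preds == targets).sum().item()
            total += targets.size(0)
    return correct / total

# compare the training-set accuracy in eval mode against the 0.968 seen during training
print('train accuracy (eval mode):', check_accuracy(model, train_loader, device))
print('test accuracy  (eval mode):', check_accuracy(model, tst_loader, device))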

Sorry, I'm a newbie, so I may be doing it all wrong.
When I call model.eval() it just uses the test loader, so I'm not sure how the train performance could be affected.
Please bear with me. Is there anything in the code above that looks off to you?
Here's what I'm doing, in a Jupyter notebook:
I set up the train loader and the test loader.
In one cell I run the train script.
Once training is done, I run the test script.

@lima @ptrblck
As I said earlier in my comment:
You passed the model into the train function and trained it without returning it afterwards.
So the model you pass into the test function is not a trained model, but rather a model that has yet to be trained.
The training and weight updates only happened inside the train function, so if you use the model anywhere else it won't have those trained weights and will still be the untrained model, simply because you passed the model as a parameter to the train function and kept the trained model local to that function.

So you can either make the model global so that the train and test functions both access and use it,

Or

return the trained model from the train function after training and assign the result to a variable for inference,

Or

save the already trained model and load it when you want to run inference.
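For the third option, a minimal save/load sketch (it assumes the model defined above; the file name is only illustrative):

# after training: save the trained weights
torch.save(model.state_dict(), 'trained_model.pth')

# at inference time: rebuild the model, load the weights, then switch to eval mode
model.load_state_dict(torch.load('trained_model.pth'))
model.eval()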

@lima @ptrblck
Make the model global so that the train and test functions can both access it without it being passed in as a parameter.

Given that: model = the_model_that_you_declared_outside()

def train(device, train_loader, criterion, optimizer, scheduler, epoch, iter_meter, experiment):
    model.train()
    liveloss = PlotLosses()
    data_len = len(train_loader.dataset)
    with experiment.train():
        
        logs = {}
        running_loss = 0.0
        running_corrects = 0
        
        for batch_idx, _data in enumerate(train_loader):
            features, labels = _data[:][:,:,:-1], _data[..., -1]
            features = features.permute(0, 2, 1)

            features, labels = features.to(device), labels.to(device) 
            
            optimizer.zero_grad()

            output = model(features) 
            loss = criterion(output, torch.max(labels, 1)[1])
            loss.backward()

            experiment.log_metric('loss', loss.item(), step=iter_meter.get())
            experiment.log_metric('learning_rate', scheduler.get_last_lr(), step=iter_meter.get())

            optimizer.step()
            scheduler.step()
            iter_meter.step()
            
            _, preds = torch.max(output, 1)
            running_loss += loss.detach() * features.size(0)
            running_corrects += torch.sum(preds == torch.max(labels, 1)[1])
        epoch_loss = running_loss / len(train_loader.dataset)
        epoch_acc = running_corrects.float() / len(train_loader.dataset)
        logs['log loss'] = epoch_loss.item()
        logs['accuracy'] = epoch_acc.item()
    liveloss.update(logs)
    liveloss.send()


iter_meter = IterMeter()
for epoch in range(1, epochs + 1):
    train(device, train_loader, criterion, optimizer, scheduler, epoch, iter_meter, experiment)
    
#The evaluation script is:


def test(device, tst_loader, criterion, epoch, iter_meter, experiment):
    print('\nevaluating...')
    model.eval()
    test_loss = 0
    liveloss = PlotLosses()
    data_len = len(tst_loader.dataset)
    with experiment.test():
        with torch.no_grad():
        
            logs = {}
            running_loss = 0.0
            running_corrects = 0

            for batch_idx, _data in enumerate(tst_loader):
                features, labels = _data[:][:,:,:-1], _data[..., -1]
                features = features.permute(0, 2, 1)

                features, labels = features.to(device), labels.to(device)


                output = model(features) 
                loss = criterion(output, torch.max(labels, 1)[1])
                test_loss += loss.item() / len(tst_loader)

                experiment.log_metric('loss', loss.item(), step=iter_meter.get())

                iter_meter.step()

                _, preds = torch.max(output, 1)
                running_loss += loss.detach() * features.size(0)
                running_corrects += torch.sum(preds == torch.max(labels, 1)[1])
                
            epoch_loss = running_loss / len(tst_loader.dataset)
            epoch_acc = running_corrects.float() / len(tst_loader.dataset)
            logs['log loss'] = epoch_loss.item()
            logs['accuracy'] = epoch_acc.item()
    liveloss.update(logs)
    liveloss.send()

iter_meter = IterMeter()
for epoch in range(1, epochs + 1):
    test(device, tst_loader, criterion, epoch, iter_meter, experiment)
    

Or you can return the model from the train function after training, like so:

def train(model, device, train_loader, criterion, optimizer, scheduler, epoch, iter_meter, experiment):
    model.train()
    liveloss = PlotLosses()
    data_len = len(train_loader.dataset)
    with experiment.train():
        
        logs = {}
        running_loss = 0.0
        running_corrects = 0
        
        for batch_idx, _data in enumerate(train_loader):
            features, labels = _data[:][:,:,:-1], _data[..., -1]
            features = features.permute(0, 2, 1)

            features, labels = features.to(device), labels.to(device) 
            
            optimizer.zero_grad()

            output = model(features) 
            loss = criterion(output, torch.max(labels, 1)[1])
            loss.backward()

            experiment.log_metric('loss', loss.item(), step=iter_meter.get())
            experiment.log_metric('learning_rate', scheduler.get_last_lr(), step=iter_meter.get())

            optimizer.step()
            scheduler.step()
            iter_meter.step()
            
            _, preds = torch.max(output, 1)
            running_loss += loss.detach() * features.size(0)
            running_corrects += torch.sum(preds == torch.max(labels, 1)[1])
        epoch_loss = running_loss / len(train_loader.dataset)
        epoch_acc = running_corrects.float() / len(train_loader.dataset)
        logs['log loss'] = epoch_loss.item()
        logs['accuracy'] = epoch_acc.item()
    liveloss.update(logs)
    liveloss.send()
    return model


iter_meter = IterMeter()
for epoch in range(1, epochs + 1):
    model = train(model, device, train_loader, criterion, optimizer, scheduler, epoch, iter_meter, experiment)
    
#The evaluation script is:


def test(model, device, tst_loader, criterion, epoch, iter_meter, experiment):
    print('\nevaluating...')
    model.eval()
    test_loss = 0
    liveloss = PlotLosses()
    data_len = len(tst_loader.dataset)
    with experiment.test():
        with torch.no_grad():
        
            logs = {}
            running_loss = 0.0
            running_corrects = 0

            for batch_idx, _data in enumerate(tst_loader):
                features, labels = _data[:][:,:,:-1], _data[..., -1]
                features = features.permute(0, 2, 1)

                features, labels = features.to(device), labels.to(device)


                output = model(features) 
                loss = criterion(output, torch.max(labels, 1)[1])
                test_loss += loss.item() / len(tst_loader)

                experiment.log_metric('loss', loss.item(), step=iter_meter.get())

                iter_meter.step()

                _, preds = torch.max(output, 1)
                running_loss += loss.detach() * features.size(0)
                running_corrects += torch.sum(preds == torch.max(labels, 1)[1])
                
            epoch_loss = running_loss / len(tst_loader.dataset)
            epoch_acc = running_corrects.float() / len(tst_loader.dataset)
            logs['log loss'] = epoch_loss.item()
            logs['accuracy'] = epoch_acc.item()
    liveloss.update(logs)
    liveloss.send()

iter_meter = IterMeter()
for epoch in range(1, epochs + 1):
    test(model, device, tst_loader, criterion, epoch, iter_meter, experiment)

@Henry_Chibueze @ptrblck

I think I can pinpoint the issue, but I still can't get it solved.
The problem is that I don't know how to get the predictions to look like the labels.

test_acc = 0.0
for _data in loaders['test']:
    with torch.no_grad():
        data, target = _data[:][:,:,:-1], _data[..., -1]
        data = data.permute(0, 2, 1)
        data, target = data.to(device), target.to(device)
        output = trained_model(data)
        # calculate accuracy
        _, pred = torch.max(output, dim=1) 

        correct = torch.sum(pred.eq(torch.max(target, 1)[1]))
        print(f'prediction: {pred}')
        print(f'label: {(torch.max(target, 1)[1])}')

        test_acc += torch.mean(correct.float())
print('Accuracy of the network on {} test frames: {}%'.format(len(tst_data3), round(test_acc.item()*100.0/len(loaders['test']), 2)))

If I print the prediction and the label, I get different values:

print(f'prediction: {pred}')

prediction: tensor([194,  86, 492, 492, 132, 132, 263, 216, 241,  17, 263, 216, 127, 399,
        492, 500, 420, 263, 390, 510, 216, 510, 132, 194, 263, 217, 263,  23,
        216, 395, 132, 297, 390, 194, 263, 492, 114, 216, 194, 503,  20, 217,
        297, 477, 476, 263, 263, 479, 466, 500, 132, 263, 361, 194,  92, 510,
        393, 216, 500,  53, 194, 510, 216, 216, 269, 216, 228, 194, 119, 415,
        477, 477, 114, 477, 476, 477, 503, 194, 170, 216, 263, 194, 221, 503,
        263, 466, 263, 114, 263, 492,  46, 449, 286, 286, 263, 132, 420, 406,
        492, 194, 263, 194, 263, 263, 194, 477, 114, 119, 477, 510, 241, 492,
        286, 263, 170, 216, 216, 194, 127, 263, 492, 477, 216, 114, 194, 360,
        479, 390], device='cuda:0')
print(f'label: {(torch.max(target, 1)[1])}')

label: tensor([3, 0, 0, 0, 1, 0, 6, 0, 2, 0, 5, 0, 1, 0, 4, 0, 0, 0, 0, 0, 0, 4, 0, 0,
        3, 0, 4, 2, 0, 6, 0, 0, 3, 0, 0, 3, 0, 0, 0, 1, 0, 0, 5, 0, 0, 6, 0, 0,
        0, 6, 0, 5, 0, 0, 6, 0, 1, 0, 4, 0, 0, 4, 0, 1, 0, 1, 0, 0, 0, 4, 0, 0,
        0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 4, 0, 0, 6, 0, 0, 5, 0,
        0, 0, 1, 0, 3, 0, 3, 4, 0, 5, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 3, 0,
        0, 0, 1, 0, 1, 1, 0, 6], device='cuda:0')
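One sanity check worth running here (a debugging sketch, not part of the original snippet): compare the model's output dimension with the number of classes implied by the labels, since the predictions go up to around 510 while the targets stay in the range 0 to 6.

with torch.no_grad():
    _data = next(iter(loaders['test']))
    data, target = _data[:][:, :, :-1], _data[..., -1]
    data = data.permute(0, 2, 1)
    output = trained_model(data.to(device))
    targets = torch.max(target, 1)[1]
    # the number of output units should match the number of distinct target classes
    print('model output classes:', output.shape[1])
    print('distinct target classes in this batch:', targets.unique().numel())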

I don't think the claim that the model needs to be returned (or made global) is correct, as seen e.g. here:

import torch
import torch.nn as nn

def train(model, optimizer):
    for epoch in range(10):
        optimizer.zero_grad()
        output = model(torch.randn(1, 1))
        loss = criterion(output, torch.randn(1, 1))
        loss.backward()
        optimizer.step()
        print('epoch {}, loss {}, weight {}'.format(
            epoch, loss.item(), model.weight))

if __name__=='__main__':
    model = nn.Linear(1, 1)
    print('before training')
    print(model.weight)
    
    optimizer = torch.optim.SGD(model.parameters(), lr=1.)
    criterion = nn.MSELoss()
    
    train(model, optimizer)
    print('after training')
    print(model.weight)

As you can see, the model is created inside the if __name__ == '__main__' guard, passed to the train function, and is not returned.
Nevertheless, the weight updates are performed in place on the model's parameters, so the model in the calling scope will contain the updated parameters.