Weird behavior when obtaining training accuracy


I am new to PyTorch and have been working on the CIFAR10 dataset. When I plot the accuracy history over the training epochs, I see weird behavior: the accuracy for every epoch up to the penultimate one is negligible, after which it shoots up to something reasonable. I get something like the following:

[Plot: training accuracy per epoch]

The training loop that I have is

def train(dataloader, model, loss_fn, optimizer, num_epochs):
    size = len(dataloader.dataset)
    batch_size = len(dataloader)
    loss_train = np.zeros(num_epochs) # Loss history per epoch during training
    acc_train = np.zeros(num_epochs) # Accuracy history per epoch during training
    running_correct = 0 # Keeps track of the number of correct classifications
    running_total = 0
    for epoch in range(num_epochs):
        for i, (image, label) in enumerate(dataloader):
            image, label =,
            # Calculate prediction and loss for the current batch
            label_pred = model(image)
            loss = loss_fn(label_pred, label)
            # Backpropagating the loss function to re-adjust weights

            loss_train[epoch] += loss.item()*image.size(0) # Accumulated loss per training batch
            _, predicted = torch.max(label_pred, 1)
            running_total += label.size(0)
            running_correct += (predicted == label).sum().item()
            acc_train[epoch] = running_correct # Number of correct predictions per training batch

            if (i + 1) % 2500 == 0:
                print (f'Epoch [{epoch+1}/{num_epochs}], Step [{i+1}/{batch_size}], Loss: {loss:.4f}')

        loss_train[epoch] = loss_train[epoch]/size
        acc_train = acc_train/running_total # Proportion of accurate predictions      
    return loss_train, acc_train

I am not sure why I am seeing this behavior.


Your running_* variables (running_correct and running_total) should be initialized inside the epoch loop so that they are reset at the start of each epoch.

The acc_train[epoch] assignment should be outside the batch loop, so that it is written once per epoch after all batches have been processed.
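As a rough sketch of what that could look like: the counters are reset at the start of every epoch, and each epoch's accuracy is written exactly once, to its own slot. The optimizer calls are an assumption, since the original post does not show that step, and the tiny random dataset at the bottom is purely hypothetical (it stands in for CIFAR10 so the snippet runs on its own).

```python
import numpy as np
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train(dataloader, model, loss_fn, optimizer, num_epochs):
    size = len(dataloader.dataset)
    loss_train = np.zeros(num_epochs)  # mean loss per epoch
    acc_train = np.zeros(num_epochs)   # accuracy per epoch

    for epoch in range(num_epochs):
        # Reset the counters at the start of every epoch
        running_correct = 0
        running_total = 0

        for image, label in dataloader:
            label_pred = model(image)
            loss = loss_fn(label_pred, label)

            # Assumed ordering; the question does not show this step
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            loss_train[epoch] += loss.item() * image.size(0)
            predicted = label_pred.argmax(dim=1)
            running_total += label.size(0)
            running_correct += (predicted == label).sum().item()

        # Written once per epoch, touching only this epoch's entry
        loss_train[epoch] /= size
        acc_train[epoch] = running_correct / running_total

    return loss_train, acc_train


# Hypothetical stand-in data: 64 random samples, 8 features, 3 classes
torch.manual_seed(0)
X = torch.randn(64, 8)
y = torch.randint(0, 3, (64,))
loader = DataLoader(TensorDataset(X, y), batch_size=16)
net = nn.Linear(8, 3)
opt = torch.optim.SGD(net.parameters(), lr=0.1)
losses, accs = train(loader, net, nn.CrossEntropyLoss(), opt, num_epochs=3)
print(losses, accs)
```

With this layout every entry of acc_train is a proportion between 0 and 1 for its own epoch, independent of the other epochs.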

Also look into the thread "Order of backward(), step() and zero_grad()" to check the ordering of your backward() and zero_grad() calls.
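To see why the ordering matters, here is a small sketch showing that gradients add up across backward() calls unless they are cleared first. The single parameter w and the toy loss are made up for illustration, and lr=0 is used only so that w itself does not move.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.0)  # lr=0: w stays fixed, only gradients change


def loss_of(w):
    return (2.0 * w).sum()  # d(loss)/dw = 2


# Without zero_grad(): successive backward() calls accumulate into w.grad
loss_of(w).backward()
loss_of(w).backward()
grad_accumulated = w.grad.clone()  # 2 + 2

# With zero_grad() before backward(): only the current gradient remains
optimizer.zero_grad()
loss_of(w).backward()
optimizer.step()
grad_fresh = w.grad.clone()

print(grad_accumulated, grad_fresh)
```

A common pattern per batch is therefore zero_grad(), then backward(), then step(). Calling zero_grad() between backward() and step() would wipe the gradients before they are used.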

Generally, weird trends like this are caused by the data or the hyper-parameters rather than by broken training logic. Does this happen every time, or was it a one-off?

Thanks for the answer. It solved the problem. I was also updating one of the arrays incorrectly, so it is all good now.
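For anyone landing here later, one likely candidate for the incorrectly updated array is the line acc_train = acc_train/running_total inside the epoch loop: it divides the whole array every epoch, so earlier entries get divided again and again while only the last entry is divided once. A minimal NumPy sketch of that effect, using made-up per-epoch counts instead of a real model:

```python
import numpy as np

num_epochs = 3
correct_per_epoch = [500, 600, 700]  # hypothetical correct counts
samples_per_epoch = 1000             # hypothetical dataset size

# Pattern from the question: counters never reset, whole array rescaled
running_correct = 0
running_total = 0
acc_buggy = np.zeros(num_epochs)
for epoch in range(num_epochs):
    running_correct += correct_per_epoch[epoch]
    running_total += samples_per_epoch
    acc_buggy[epoch] = running_correct
    acc_buggy = acc_buggy / running_total  # divides ALL entries, every epoch

# Per-epoch bookkeeping: each entry is written once and left alone
acc_fixed = np.zeros(num_epochs)
for epoch in range(num_epochs):
    acc_fixed[epoch] = correct_per_epoch[epoch] / samples_per_epoch

print(acc_buggy)  # early entries are tiny, only the last is a plausible accuracy
print(acc_fixed)
```

This reproduces the shape described in the question: near-zero values for every epoch except the last one.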