Calculating accuracy of the current minibatch?

Brando_Miranda · August 4, 2020, 6:23pm

I think the simplest answer is the one from the cifar10 tutorial:

total = 0
with torch.no_grad():
    net.eval()
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))

so:

acc = (true == pred).sum().item()

If you have a counter don’t forget to eventually divide by the size of the data-set or analogous values.

I’ve used:

N = data.size(0) # since usually it's size (batch_size, D1, D2, ...)
correct += (1/N) * correct

Self contained code:

# testing accuracy function
# https://discuss.pytorch.org/t/calculating-accuracy-of-the-current-minibatch/4308/11
# https://stackoverflow.com/questions/51503851/calculate-the-accuracy-every-epoch-in-pytorch

import torch
import torch.nn as nn

D = 1
true = torch.tensor([0,1,0,1,1]).reshape(5,1)
print(f'true.size() = {true.size()}')

batch_size = true.size(0)
print(f'batch_size = {batch_size}')
x = torch.randn(batch_size,D)
print(f'x = {x}')
print(f'x.size() = {x.size()}')

mdl = nn.Linear(D,1)
logit = mdl(x)
_, pred = torch.max(logit.data, 1)

print(f'logit = {logit}')

print(f'pred = {pred}')
print(f'true = {true}')

acc = (true == pred).sum().item()
print(f'acc = {acc}')

Also, I find this code to be good reference:

def calc_accuracy(mdl, X, Y):
    # reduce/collapse the classification dimension according to max op
    # resulting in most likely label
    max_vals, max_indices = mdl(X).max(1)
    # assumes the first dimension is batch size
    n = max_indices.size(0)  # index 0 for extracting the # of elements
    # calulate acc (note .item() to do float division)
    acc = (max_indices == Y).sum().item() / n
    return acc

Explaining pred = mdl(x).max(1)see this How does one get the predicted classification label from a pytorch model?

the main thing is that you have to reduce/collapse the dimension where the classification raw value/logit is with a max and then select it with a .indices. Usually this is dimensions 1 since dim 0 has the batch size e.g. [batch_size,D_classification] where the raw data might of size [batch_size,C,H,W]

A synthetic example with raw data in 1D as follows:

import torch
import torch.nn as nn

# data dimension [batch-size, D]
D, Dout = 1, 5
batch_size = 16
x = torch.randn(batch_size, D)
y = torch.randint(low=0,high=Dout,size=(batch_size,))

mdl = nn.Linear(D, Dout)
logits = mdl(x)
print(f'y.size() = {y.size()}')
# removes the 1th dimension with a max, which is the classification layer
# which means it returns the most likely label. Also, note you need to choose .indices since you want to return the
# position of where the most likely label is (not it's raw logit value)
pred = logits.max(1).indices
print(pred)

print('--- preds vs truth ---')
print(f'predictions = {pred}')
print(f'y = {y}')

acc = (pred == y).sum().item() / pred.size(0)
print(acc)

output:


y.size() = torch.Size([16])
tensor([3, 1, 1, 3, 4, 1, 4, 3, 1, 1, 4, 4, 4, 4, 3, 1])
--- preds vs truth ---
predictions = tensor([3, 1, 1, 3, 4, 1, 4, 3, 1, 1, 4, 4, 4, 4, 3, 1])
y = tensor([3, 3, 3, 0, 3, 4, 0, 1, 1, 2, 1, 4, 4, 2, 0, 0])
0.25

reference: