Calculating accuracy of the current minibatch?

How can I calculate the accuracy for the current mini-batch during training? My training code is just:

for epoch in range(args.epochs):
  for i, (images, captions, lengths) in enumerate(train_loader):
    targets = pack_padded_sequence(captions, lengths, batch_first=True)[0]
    features = encoder(images)
    outputs = decoder(features, captions, lengths)
    loss = criterion(outputs, targets)

How can I get the number of correct predictions from that code?
I tried several ways but can't get it to work, for example:
correct = (targets.eq(outputs)).sum()

I'm sure there should be a generic way to do this. If criterion can calculate the loss without knowing the shapes, I think I should be able to calculate accuracy as well. Any idea how to do this?

File “”, line 182, in
File “”, line 139, in main
correct = (ttargets.eq(toutputs)).sum()
File “/home/joy/anaconda3/lib/python3.6/site-packages/torch/autograd/”, line 710, in eq
return Eq()(self, other)
File “/home/joy/anaconda3/lib/python3.6/site-packages/torch/autograd/_functions/”, line 14, in forward
mask = getattr(tensor1, self.fn_name)(other)
TypeError: eq received an invalid combination of arguments - got (torch.cuda.FloatTensor), but expected one of:

  • (int value)
    didn’t match because some of the arguments have invalid types: (torch.cuda.FloatTensor)
  • (torch.cuda.LongTensor other)
    didn’t match because some of the arguments have invalid types: (torch.cuda.FloatTensor)
correct = (ttargets.eq(toutputs.long())).sum()

Thanks, that fixed the Long issue. After that I get another issue: the tensor sizes do not match.
ttargets.size() is
[torch.cuda.LongTensor of size 11 (GPU 0)]
and toutputs.size() is:
[torch.cuda.FloatTensor of size 11x13 (GPU 0)]

My question is: how can I calculate accuracy generically for any tensor, the way "loss = criterion(outputs, targets)" can calculate the loss without knowing the details? I looked through the code for criterion (CrossEntropyLoss) and I am confused about how it calculates the loss.

max_index = toutputs.max(dim = 1)[1]
(max_index == ttargets).sum()

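Details of the max function are in the docs for torch.max: when given a dim argument it returns both the maximum values and their indices, and [1] selects the indices.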
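To make the shapes concrete, here is a minimal sketch of that approach. The tensors are random stand-ins with the same sizes as in the question (11 targets, 13 classes), not the real decoder outputs, and it runs on CPU for simplicity.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins for the tensors in the question:
# 11 packed target tokens and 13 classes (vocabulary entries).
toutputs = torch.randn(11, 13)          # raw scores, one row per target
ttargets = torch.randint(0, 13, (11,))  # class indices (int64)

# max over dim=1 returns (values, indices); the indices are the
# predicted class for each row.
max_values, max_index = toutputs.max(dim=1)

correct = (max_index == ttargets).sum().item()  # Python int
accuracy = correct / ttargets.size(0)

print(max_index.shape, correct, accuracy)
```

This is also why casting the 11x13 outputs with .long() does not help: the comparison needs one predicted class index per row, not the truncated scores.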


Does the following not work:

def calc_accuracy(mdl,X,Y):
    # TODO: why can't we call .data.numpy() for train_acc as a whole?
    max_vals, max_indices = torch.max(mdl(X),1)
    train_acc = (max_indices == Y).sum().data.numpy()/max_indices.size()[0]
    return train_acc


It seems this works too:

def calc_accuracy(mdl,X,Y):
    max_vals, max_indices = torch.max(mdl(X),1)
    train_acc = (max_indices == Y).sum().item()/max_indices.size()[0]
    return train_acc

If you want the output to be a tensor:

def calc_accuracy(mdl,X,Y):
    """Calculates model accuracy

    Arguments:
        mdl {nn.model} -- nn model
        X {torch.Tensor} -- input data
        Y {torch.Tensor} -- labels/target values

    Returns:
        [torch.Tensor] -- accuracy
    """
    max_vals, max_indices = torch.max(mdl(X),1)
    n = max_indices.size(0) #index 0 for extracting the # of elements
    train_acc = (max_indices == Y).sum(dtype=torch.float32)/n
    return train_acc

I usually use this for classification:

def accuracy(true,pred):
    acc = (true.argmax(-1) == pred.argmax(-1)).float().detach().numpy()
    return float(100 * acc.sum() / len(acc))

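where true and pred are both torch tensors with the class dimension last (since argmax(-1) is applied to both, true is expected to be one-hot or per-class scores rather than class indices).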


I use the following snippet when having preds, and labels and mask:


do you have a small snippet to test this that is self contained?

Can you explain the negative index?

I know this is basic but it’s not fresh in my head. What is the mask suppose to stand for when they are predictions? is it 0-1 values? What is it suppose to mask off? Unwanted examples…?
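The snippet referred to above is not shown in this thread, so the following is only a guess at the idea, not the poster's code. It assumes the mask is a boolean tensor marking which examples should count (for instance, False for padding positions), and masked_accuracy is a hypothetical helper name.

```python
import torch

def masked_accuracy(preds, labels, mask):
    """Hypothetical helper: accuracy over positions where mask is True."""
    pred_labels = preds.argmax(dim=-1)
    # A position counts as correct only if it matches AND is not masked off.
    correct = (pred_labels == labels) & mask
    return correct.sum().item() / mask.sum().item()

# Made-up data: 4 examples, 2 classes; the last example is masked off.
preds = torch.tensor([[2.0, 0.1], [0.3, 1.2], [0.9, 0.2], [0.1, 0.8]])
labels = torch.tensor([0, 1, 1, 0])
mask = torch.tensor([True, True, True, False])

acc = masked_accuracy(preds, labels, mask)
print(acc)
```

Here two of the three unmasked predictions match, so the masked example does not affect the result.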


Can you comment on true.argmax(-1)? What does the -1 do? Hopefully an answer will be here someday: Argmax with PyTorch
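On the -1: a negative dim counts from the end, so argmax(-1) reduces over the last dimension. A small sketch, assuming true holds one-hot labels and pred holds per-class scores (both tensors are invented for illustration):

```python
import torch

# Made-up batch of 3 examples and 4 classes.
pred = torch.tensor([[0.1, 2.0, 0.3, 0.0],
                     [1.5, 0.2, 0.1, 0.4],
                     [0.0, 0.1, 0.2, 3.0]])
true = torch.tensor([[0, 1, 0, 0],
                     [0, 0, 1, 0],
                     [0, 0, 0, 1]])  # one-hot labels

# dim=-1 means "the last dimension"; for a 2-D tensor that is dim=1.
same = torch.equal(pred.argmax(-1), pred.argmax(1))

pred_labels = pred.argmax(-1)  # tensor([1, 0, 3])
true_labels = true.argmax(-1)  # tensor([1, 2, 3])

# Fraction of rows where the predicted class equals the true class.
acc = (pred_labels == true_labels).float().mean().item()
print(same, acc)
```

Using -1 rather than 1 keeps the function working when there are extra leading dimensions, as long as the classes stay in the last one.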

I think the simplest answer is the one from the cifar10 tutorial:

correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        # index of the largest score in each row is the predicted class
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))


acc = (true == pred).sum().item()

If you have a counter don’t forget to eventually divide by the size of the data-set or analogous values.

I’ve used:

N = data.size(0) # since usually it's size (batch_size, D1, D2, ...)
acc = (1/N) * correct # fraction of correct predictions in this batch
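As a sketch of the counter approach across a whole epoch: the data, model, and sizes below are synthetic placeholders, chosen so that the last batch is smaller than the others.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data: 100 examples, 8 features, 3 classes.
X = torch.randn(100, 8)
Y = torch.randint(0, 3, (100,))
loader = DataLoader(TensorDataset(X, Y), batch_size=32)  # last batch has 4
model = nn.Linear(8, 3)

correct, total = 0, 0
with torch.no_grad():
    for data, labels in loader:
        predicted = model(data).argmax(dim=1)
        batch_correct = (predicted == labels).sum().item()
        batch_acc = batch_correct / data.size(0)  # accuracy of this mini-batch
        correct += batch_correct                  # running count
        total += labels.size(0)

epoch_acc = correct / total  # divide once, by the number of examples seen
print(epoch_acc)
```

Summing counts and dividing once avoids over-weighting the smaller final batch, which would happen if the per-batch accuracies were simply averaged.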

Self contained code:

# testing accuracy function

import torch
import torch.nn as nn

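D = 1
n_classes = 2
# labels must be 1-D (shape [5]) so they compare elementwise with pred
true = torch.tensor([0,1,0,1,1])
print(f'true.size() = {true.size()}')

batch_size = true.size(0)
print(f'batch_size = {batch_size}')
x = torch.randn(batch_size,D)
print(f'x = {x}')
print(f'x.size() = {x.size()}')

# one output logit per class, so logit has shape [batch_size, n_classes]
mdl = nn.Linear(D,n_classes)
logit = mdl(x)
_, pred = torch.max(logit, 1)

print(f'logit = {logit}')

print(f'pred = {pred}')
print(f'true = {true}')

# fraction of predictions that match the labels
acc = (true == pred).sum().item() / batch_size
print(f'acc = {acc}')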

Also, I find this code to be good reference:

def calc_accuracy(mdl, X, Y):
    # reduce/collapse the classification dimension according to max op
    # resulting in most likely label
    max_vals, max_indices = mdl(X).max(1)
    # assumes the first dimension is batch size
    n = max_indices.size(0)  # index 0 for extracting the # of elements
    # calculate acc (note .item() to get a Python number for float division)
    acc = (max_indices == Y).sum().item() / n
    return acc

For an explanation of pred = mdl(x).max(1), see: How does one get the predicted classification label from a pytorch model?

The main thing is that you have to reduce/collapse the dimension holding the raw classification values (logits) with a max and then select the positions with .indices. Usually this is dimension 1, since dimension 0 is the batch size, e.g. logits of size [batch_size, D_classification], where the raw data might be of size [batch_size, C, H, W].

A synthetic example with raw data in 1D as follows:

import torch
import torch.nn as nn

# data dimension [batch-size, D]
D, Dout = 1, 5
batch_size = 16
x = torch.randn(batch_size, D)
y = torch.randint(low=0,high=Dout,size=(batch_size,))

mdl = nn.Linear(D, Dout)
logits = mdl(x)
print(f'y.size() = {y.size()}')
# removes dimension 1 (the classification layer) with a max,
# which means it returns the most likely label. Also, note you need to choose .indices since you want to return the
# position of where the most likely label is (not its raw logit value)
pred = logits.max(1).indices

print('--- preds vs truth ---')
print(f'predictions = {pred}')
print(f'y = {y}')

acc = (pred == y).sum().item() / pred.size(0)


y.size() = torch.Size([16])
tensor([3, 1, 1, 3, 4, 1, 4, 3, 1, 1, 4, 4, 4, 4, 3, 1])
--- preds vs truth ---
predictions = tensor([3, 1, 1, 3, 4, 1, 4, 3, 1, 1, 4, 4, 4, 4, 3, 1])
y = tensor([3, 3, 3, 0, 3, 4, 0, 1, 1, 2, 1, 4, 4, 2, 0, 0])
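The run above does not print acc. As a quick check, the printed predictions and labels can be pasted back in to compute it by hand:

```python
import torch

# Values copied from the printed output above.
pred = torch.tensor([3, 1, 1, 3, 4, 1, 4, 3, 1, 1, 4, 4, 4, 4, 3, 1])
y = torch.tensor([3, 3, 3, 0, 3, 4, 0, 1, 1, 2, 1, 4, 4, 2, 0, 0])

matches = (pred == y)
match_positions = matches.nonzero().flatten().tolist()
acc = matches.sum().item() / pred.size(0)

print(match_positions)  # [0, 8, 11, 12]
print(acc)              # 0.25
```

Four of the 16 predictions agree with the labels, so the accuracy for this randomly initialised model is 4/16 = 0.25.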

