# Confusing and unpredictable behavior while calculating accuracy

I have built a model and overfit it on a small batch of 16 samples from the training set. When I calculate the accuracy on this same batch, I expect 100%. However, I have used two different methods to calculate the accuracy, and they give very different results.

My forward pass function:

``````
def fwd_pass(X, y, train=False):
    if not train:
        # evaluation path: forward only, no parameter update
        outputs = model(X)
        matches = (torch.argmax(outputs, dim=1) == y).sum()
        acc = matches / len(y)
        loss = loss_fn(outputs, y)
        return acc, loss

    # training path: forward, backward and optimizer step
    outputs = model(X)
    matches = (torch.argmax(outputs, dim=1) == y).sum()
    acc = matches / len(y)
    loss = loss_fn(outputs, y)

    loss.backward()
    opt.step()

    return acc, loss
``````

My train function:

``````
def train(net, epochs, batch_size, X, y, val_X=None, val_y=None):
    accuracies = []
    losses = []
    val_accuracies = []
    val_losses = []

    for ep in tqdm(range(epochs)):
        for i in tqdm(range(0, len(X), batch_size)):
            batch_X = X[i:i+batch_size].to(device)
            batch_y = y[i:i+batch_size].to(device)

            acc, loss = fwd_pass(batch_X.float(), batch_y, train=True)

            torch.cuda.empty_cache()

        if val_X is not None and val_y is not None:
            val_acc, val_loss = fwd_pass(val_X.to(device), val_y.to(device))

        print(f'Ep: {ep+1} Acc: {round(float(acc), 5)} Loss: {round(float(loss), 5)}')

        if val_X is not None and val_y is not None:
            print(f'Val Acc: {round(float(val_acc), 5)} Val Loss: {round(float(val_loss), 5)}')

        accuracies.append(acc)
        losses.append(loss)

        if val_X is not None and val_y is not None:
            val_accuracies.append(val_acc)
            val_losses.append(val_loss)

    return accuracies, losses, val_accuracies, val_losses
``````

Now, here is the first way I calculate the accuracy:

``````
corr = 0
tot = 0

for i in range(len(batch_1_X)):
    # predict one sample at a time
    op = model(batch_1_X[i].view(1, 100, 8).transpose(1, 2).to(device).float())
    pred = torch.argmax(op)
    real = batch_1_y[i]

    if pred == real:
        corr += 1

    tot += 1

print(corr)
print(tot)
``````

The above gives me the following output:

``````
3
16
``````

suggesting that only 3 out of 16 have been predicted correctly, even though the model has been overfit.

Now, this method:

``````
corr = 0
tot = 0

# predict the whole batch at once
op = model(batch_1_X.transpose(1, 2).to(device).float())
preds = torch.argmax(op, dim=1)
for p, r in zip(preds, batch_1_y):
    if p == r:
        corr += 1

    tot += 1

print(corr)
print(tot)
``````

gives me the output

``````
16
16
``````

suggesting that all predictions are correct. I also tried varying the number of samples passed to the model at once:

``````
corr = 0
tot = 0
num = 16

op = model(batch_1_X[0:num].transpose(1, 2).to(device).float())
preds = torch.argmax(op, dim=1)
for p, r in zip(preds, batch_1_y[0:num]):
    if p == r:
        corr += 1

    tot += 1

print(corr)
print(tot)
``````

When `num` is 16, `corr` and `tot` are both 16. For values of `num` like 1, 3, 4, 5, 6, 7, 8, 9, …, `corr` and `tot` are equal. But when `num` is 2, `corr` is 1 and `tot` is 2, suggesting that the model got only 1 of the 2 predictions right.

What is the mistake I am making that is giving me this unpredictable behavior?

Did you make sure to call `model.eval()` before calculating the accuracy?
Also, I would double-check the `view` and `transpose` operations and make sure the logic is correct for all the different batch sizes.
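
For example, a quick check along these lines (with a stand-in tensor, since I don't have your data) would confirm that the single-sample path and the batched path feed the model the same values:

``````
import torch

batch_1_X = torch.randn(16, 100, 8)  # stand-in for the real batch

# single-sample path, as in the loop above
single = batch_1_X[0].view(1, 100, 8).transpose(1, 2)  # (1, 8, 100)

# batched path, as in the second snippet
batched = batch_1_X.transpose(1, 2)[0:1]               # (1, 8, 100)

print(torch.equal(single, batched))  # True -> both paths see identical values
``````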

Instead of model.eval(), I have wrapped all of the code in a torch.no_grad() block. With regards to the shapes of the input tensors, I checked them beforehand and they matched. Are you suggesting that there might be a problem with the values while transposing the tensor itself?

Note that these calls do not perform the same operations.
`model.eval()` changes the behavior of some layers (e.g. batchnorm layers will use their running stats and dropout will be disabled), while `no_grad()` disallows the gradient calculation and saves memory by not storing intermediate tensors, so you cannot use one instead of the other.
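
Here is a minimal standalone sketch (a plain dropout layer, not your model) showing the difference:

``````
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)  # a freshly created module is in train mode
x = torch.ones(1, 4)

# no_grad() alone does NOT disable dropout
with torch.no_grad():
    print(drop(x))  # random elements zeroed, the rest scaled by 1/(1-p)

# eval() disables dropout (and makes batchnorm use its running stats)
drop.eval()
with torch.no_grad():
    print(drop(x))  # tensor([[1., 1., 1., 1.]])
``````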

Here is my code with model.train():

``````
corr = 0
tot = 0

model.train()

for i in range(len(batch_1_X)):
    op = model(batch_1_X[i].view(-1, 100, 8).transpose(1, 2).to(device).float())
    pred = torch.argmax(op)
    real = batch_1_y[i]

    if pred == real:
        corr += 1

    tot += 1

print(corr)
print(tot)

print("-" * 20)

corr = 0
tot = 0

op = model(batch_1_X.transpose(1, 2).to(device).float())
preds = torch.argmax(op, dim=1)
# print(batch_1_y)
# print(preds)
for p, r in zip(preds, batch_1_y):
    if p == r:
        corr += 1

    tot += 1

print(corr)
print(tot)
``````

This gives me the output:

``````
3
16
--------------------
16
16
``````

Switching it to model.eval():

``````
corr = 0
tot = 0

model.eval()

for i in range(len(batch_1_X)):
    op = model(batch_1_X[i].view(-1, 100, 8).transpose(1, 2).to(device).float())
    pred = torch.argmax(op)
    real = batch_1_y[i]

    if pred == real:
        corr += 1

    tot += 1

print(corr)
print(tot)

print("-" * 20)

corr = 0
tot = 0

op = model(batch_1_X.transpose(1, 2).to(device).float())
preds = torch.argmax(op, dim=1)
# print(batch_1_y)
# print(preds)
for p, r in zip(preds, batch_1_y):
    if p == r:
        corr += 1

    tot += 1

print(corr)
print(tot)
``````

This gives the output:

``````
8
16
--------------------
11
16
``````

You were right @ptrblck, model.eval() does produce different results! But I overfit the model to 100% accuracy on the training data, and when evaluating on that same training data I am not getting 100% accuracy. The two methods of measuring accuracy also still give differing results. Any suggestions would help!

Hey @ptrblck! A follow-up to this: it was a really silly mistake. In the model definition, I have 2 LSTM layers. For the first, I passed `batch_first=True`, but for the second I had forgotten to. I have had this code for months and never thought to recheck the model definition. Here is the corrected model with `batch_first=True` for the second layer as well:

``````
class TFModel(nn.Module):
    def __init__(self):
        super().__init__()

        self.conv1 = nn.Conv1d(8, 16, kernel_size=8)
        self.conv2 = nn.Conv1d(16, 32, kernel_size=8)
        self.conv3 = nn.Conv1d(32, 64, kernel_size=8)

        self.bn1 = nn.BatchNorm1d(64)  # after pooling of 2

        self.conv4 = nn.Conv1d(64, 64, kernel_size=8)
        self.conv5 = nn.Conv1d(64, 128, kernel_size=8)

        self.bn2 = nn.BatchNorm1d(128)  # after pooling of 2

        # self.flat = nn.Flatten()

        self.lstm1 = nn.LSTM(128, 100, batch_first=True)
        self.lstm2 = nn.LSTM(100, 128, batch_first=True)  # batch_first was missing here

        self.fc1 = nn.Linear(128, 64)
        self.fc2 = nn.Linear(64, 32)
        self.fc3 = nn.Linear(32, classes)  # `classes` is the number of outputs, defined elsewhere

    def exec_conv_block(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))

        x = F.max_pool1d(x, 2)
        # x = self.bn1(x)

        x = F.relu(self.conv4(x))
        x = F.relu(self.conv5(x))

        x = F.max_pool1d(x, 2)
        # x = self.bn2(x)

        return x

    def forward(self, x):
        x = self.exec_conv_block(x)

        x, _ = self.lstm1(x.transpose(1, 2))
        x, _ = self.lstm2(x)

        x = x[:, -1, :]  # last time step

        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)

        return x
``````
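
To make the failure mode concrete, here is a minimal standalone sketch (made-up sizes, not my real model) of what the missing argument does:

``````
import torch
import torch.nn as nn

x = torch.randn(16, 20, 128)  # intended as (batch, seq_len, features)

lstm_bf = nn.LSTM(128, 100, batch_first=True)
lstm_default = nn.LSTM(128, 100)  # default expects (seq_len, batch, features)

out_bf, _ = lstm_bf(x)        # (16, 20, 100) -- batch stays in dim 0
out_def, _ = lstm_default(x)  # no error! x is read as 16 time steps of a batch of 20

print(out_bf.shape)   # torch.Size([16, 20, 100])
print(out_def.shape)  # torch.Size([16, 20, 100]) -- same shape, wrong meaning
``````

Because the output shape is identical either way, nothing crashes; the recurrence just runs across samples instead of across time, which is why the bug stayed hidden until the accuracy numbers gave it away.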