Model giving different test accuracies for different batch sizes

I am running the official code for the Pyramid Vision Transformer (PVT) on CIFAR-10.

For a batch size of 128, my test accuracy starts at 40%, but for a batch size of 4 it starts at 44-45%.

Here’s a high-level visualization of the PVT model for a quick understanding:

I have attached my Colab code below.

The code for the model has been directly taken from the official PVT code:

I am not sure why I am getting different accuracies for different batch sizes in PyTorch. I converted the code to Keras, and there I get the same test accuracy across different batch sizes, so I’m not sure where I’m going wrong. I’d be glad if someone could help me with this. Thanks!

Changing the validation batch size alone should not change the validation accuracy.
However, in your current code you are changing the training and validation batch sizes together, which is expected to potentially change the accuracy, since a different training batch size changes how the model converges (e.g. the number of optimizer steps per epoch and the amount of gradient noise). You can decouple the two, as in the sketch below.
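A minimal sketch of decoupling them, assuming the standard torchvision CIFAR-10 datasets (adapt the transforms to whatever the PVT training script actually uses):

```python
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Placeholder transform; the real PVT script applies its own augmentations.
transform = transforms.ToTensor()
train_dataset = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
val_dataset = datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

# Keep the training batch size fixed so convergence is unaffected.
train_loader = DataLoader(train_dataset, batch_size=128, shuffle=True)

# Vary only the validation batch size; this alone should not change accuracy.
val_loader_a = DataLoader(val_dataset, batch_size=128, shuffle=False)
val_loader_b = DataLoader(val_dataset, batch_size=4, shuffle=False)
```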
I don’t know why you are not seeing the same effect in Keras; getting the same accuracy for different training batch sizes is surprising.
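As a quick sanity check on the PyTorch side, evaluating the same trained model with the two validation loaders should give the same accuracy, as long as the model is in eval mode (dropout and batchnorm behave differently in train mode and can make the result depend on the batch size). A rough sketch, where `model` stands in for your trained PVT:

```python
import torch

@torch.no_grad()
def evaluate(model, loader, device="cpu"):
    # eval() disables dropout and uses running (batch-size-independent)
    # normalization statistics during inference.
    model.eval()
    correct = total = 0
    for images, targets in loader:
        images, targets = images.to(device), targets.to(device)
        preds = model(images).argmax(dim=1)
        correct += (preds == targets).sum().item()
        total += targets.size(0)
    return correct / total

# For a fixed, trained model, both loaders should report the same accuracy:
# evaluate(model, val_loader_a) vs. evaluate(model, val_loader_b)
```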

Thanks for the reply! I’ll look into my Keras code to see if I’m missing anything. Also, in my Keras model, the accuracy stagnates after 10-15 epochs. Is there anything I can try to make sure I get the same accuracy as the PyTorch model?