Model giving different test accuracies for different batch sizes

I am running the official code for the Pyramid Vision Transformer (PVT) on CIFAR-10.

With a batch size of 128, my test accuracy starts at 40%, but with a batch size of 4 it starts at 44-45%.

Here’s a high-level visualization of the PVT model for a quick understanding:

I have attached my colab code below.

The code for the model has been directly taken from the official PVT code:

I am not sure why I am getting different accuracies for different batch sizes in PyTorch. I have converted the code to Keras, and there I get the same test accuracy for different batch sizes, so I’m not sure where I’m going wrong. I’d be glad if someone could help me with this. Thanks!

Changing the validation batch size alone should not change the validation accuracy.
However, in your current code you are changing the training and validation batch sizes together. That can be expected to change the accuracy, since the training batch size affects how the model converges.
I don’t know why you are not seeing the same issue in Keras, but it’s surprising to see the same accuracy for different training batch sizes.
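
To check the first point in isolation, you could evaluate one fixed set of weights with two different validation batch sizes. Below is a minimal sketch, not your actual setup: the tiny model, the random tensors standing in for CIFAR-10, and the helper name `count_correct` are all hypothetical placeholders. It assumes the model is put into eval mode, so that dropout is disabled and normalization layers use their stored statistics rather than per-batch ones.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical stand-in for the real model (not PVT). float64 keeps
# rounding differences between batch sizes from flipping a near-tied
# prediction in this toy example.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 64),
    nn.BatchNorm1d(64),
    nn.ReLU(),
    nn.Dropout(0.1),
    nn.Linear(64, 10),
).double()

# Random data with CIFAR-10 shapes, standing in for the real test set.
images = torch.randn(256, 3, 32, 32, dtype=torch.float64)
labels = torch.randint(0, 10, (256,))
val_set = TensorDataset(images, labels)


def count_correct(model, dataset, batch_size):
    """Count correct top-1 predictions over the whole dataset."""
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)
    model.eval()  # disable dropout, use running BatchNorm statistics
    correct = 0
    with torch.no_grad():
        for x, y in loader:
            correct += (model(x).argmax(dim=1) == y).sum().item()
    return correct


correct_bs4 = count_correct(model, val_set, batch_size=4)
correct_bs128 = count_correct(model, val_set, batch_size=128)
print(correct_bs4, correct_bs128)
```

If the two counts differ with your real model and the same checkpoint, I would first check that `model.eval()` is called before validation and that the accuracy is averaged over samples rather than over batches. If they match, the gap you see most likely comes from the training batch size.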

Thanks for the reply! I’ll look into my Keras code to see if I am missing anything. Also, in my Keras model, the accuracy stagnates after 10-15 epochs. Is there anything I can try so that it reaches the same accuracy as the PyTorch model?