I have trained a RegNet model on a custom dataset for an image classification task. That was in August 2023. Now I want to train exactly the same model again, using the same dataset. I would expect this new model to achieve about the same performance as the previous one from August 2023, since nothing has changed:
- I use exactly the same PyTorch and Torchvision versions (1.13 and 0.14; see the version check snippet after this list)
- I use exactly the same image dataset for training/validation/test
- I use exactly the same script to train the model via torch
- And I use exactly the same training hyperparams as before
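For reference, this is the kind of quick sanity check I run before training to confirm the installed versions match the August 2023 environment (just a sketch, not part of the training script itself):

```python
import numpy
import torch
import torchvision

# Versions the August 2023 model was trained with
print("torch:", torch.__version__)              # expected 1.13.x
print("torchvision:", torchvision.__version__)  # expected 0.14.x
print("numpy:", numpy.__version__)
# CUDA / cuDNN that this torch build uses (not the system driver version)
print("CUDA (build):", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())
print("GPU:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "none")
```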
However, even though nothing has changed, the newly trained model performs significantly worse than the original model from last year. Where the first model from August 2023 achieves a test accuracy of 0.97, the new model only achieves 0.94 on the very same test dataset. During training, though, the train and validation accuracies are about the same as before.
I understand that two models will not achieve exactly the same performance, but a difference of three percentage points seems too much. Whatever I do, I cannot get close to the 0.97 test accuracy from last year; about 0.94 is all I get, even though everything is exactly the same, as described. Even the machine with its four GPUs and the Ubuntu version running on it are exactly the same as in 2023.
I know there is a random seed involved, but I doubt that alone could cause a test accuracy gap of 3 percentage points. I also know that the Nvidia / CUDA driver may have been updated on that machine, along with some dependencies and packages (e.g. numpy). But can that lead to such a huge difference?
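For what it's worth, this is the kind of seeding / determinism setup I could add to rule out randomness as the cause (just a sketch; the helper name and seed value are illustrative, not part of my actual script):

```python
import os
import random

import numpy as np
import torch

def seed_everything(seed: int = 42) -> None:
    """Pin every RNG I know of so two runs start from the same state.
    (Function name and default seed are just illustrative.)"""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade some speed for reproducible GPU kernels
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # Warn (instead of silently differing) when an op has no deterministic variant
    torch.use_deterministic_algorithms(True, warn_only=True)
    # Required by cuBLAS for deterministic matmuls on CUDA >= 10.2
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

seed_everything(42)
```

Even with this in place, DataLoader shuffling across workers and the reduction order on four GPUs can still introduce some run-to-run noise, but I would not expect that to explain 3 percentage points.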