I have a set of PyTorch classification models. I trained them with the same parameters on two different datasets:
- A first time around 5 months ago with
- A second time now with
I tested inference for both sets of models inside the same environment with torch==1.12.1, and somehow the newly trained models have twice the latency of the older ones (15 ms vs. 30 ms).
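For concreteness, this is roughly how I measure latency (a minimal sketch with warmup; the `Sequential` model here is a stand-in, not my actual classifier):

```python
import time
import torch

# Stand-in model, not the real classifier
model = torch.nn.Sequential(
    torch.nn.Conv1d(8, 16, kernel_size=3),   # (1, 8, 64) -> (1, 16, 62)
    torch.nn.ReLU(),
    torch.nn.Flatten(),                      # -> (1, 16 * 62)
    torch.nn.Linear(16 * 62, 4),
)
model.eval()
x = torch.randn(1, 8, 64)

with torch.no_grad():
    for _ in range(10):                      # warmup iterations
        model(x)
    n = 100
    start = time.perf_counter()
    for _ in range(n):
        model(x)
    latency_ms = (time.perf_counter() - start) / n * 1e3

print(f"mean latency: {latency_ms:.2f} ms")
```

Both old and new models are timed this way, single-threaded on CPU, same input shapes.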
It doesn't seem to be the torch version used for training: I retrained with version 1.9.1 and got a slower model there too.
Specifically, I looked into one of the models, which combines 1D convolutional, LSTM, and linear layers. Profiling both trainings of this model, I saw that mainly the convolution and LSTM operations got much slower. The weights have comparable means, but the older ones have a larger standard deviation (2 to 10x).
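A minimal sketch of the kind of profiling and weight comparison I mean (the `TinyNet` model below is a stand-in with the same layer types, not my real architecture; in practice I load the two trained checkpoints instead):

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Stand-in with the same layer types (Conv1d -> LSTM -> Linear)
class TinyNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv1d(8, 16, kernel_size=3)
        self.lstm = torch.nn.LSTM(16, 32, batch_first=True)
        self.fc = torch.nn.Linear(32, 4)

    def forward(self, x):
        y = self.conv(x)          # (N, 16, L-2)
        y = y.transpose(1, 2)     # (N, L-2, 16) for batch_first LSTM
        y, _ = self.lstm(y)
        return self.fc(y[:, -1])  # classify from last time step

model = TinyNet().eval()
x = torch.randn(1, 8, 64)

# Per-operator CPU times; conv and LSTM ops dominate in my runs
with torch.no_grad(), profile(activities=[ProfilerActivity.CPU]) as prof:
    model(x)
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))

# Per-parameter mean / std, compared between old and new checkpoints
for name, p in model.named_parameters():
    print(f"{name}: mean={p.mean():.4f} std={p.std():.4f}")
```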
Another strange thing I noticed is that when retraining on the new dataset for a single epoch, I get the same latency as the previous training, even though the model-training code hasn't changed between the two.
Is there anything I might be missing, or something I forgot to upgrade when upgrading the torch packages?