Different speed of models when loading from snapshot and when randomly initializing weights

nasretdinovr · July 15, 2022, 7:49am

I have two models with the same architecture. The first has trained weights; in the second, all weights (including biases) are drawn from a normal distribution. The second one is 50% faster than the first. I measured the time on CPU.
I also used torch.profiler and found that individual operations such as conv2d are faster in the second case. Is this normal behavior for a model?
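
For anyone who wants to reproduce this kind of comparison, here is a minimal timing sketch using only the standard library. The helper name `median_runtime` and its `warmup` and `repeats` parameters are hypothetical, not part of PyTorch or of the original measurement. It runs a few warm-up calls first and reports the median, so one-off costs and outliers do not skew the result.

```python
import statistics
import time


def median_runtime(fn, warmup=5, repeats=30):
    """Return the median wall-clock time (seconds) of calling fn()."""
    # Warm-up calls are not timed: they absorb one-off costs such as
    # lazy initialization and cold caches.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)

    # The median is less sensitive to occasional slow runs than the mean.
    return statistics.median(samples)
```

With PyTorch, `fn` could be something like `lambda: model(x)` run under `torch.no_grad()`, using the same input `x` for both models.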

ptrblck · July 15, 2022, 8:38am

Could you check whether enabling torch.set_flush_denormal speeds up the slower model?
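
Denormal (subnormal) values are nonzero floats smaller in magnitude than the smallest normal float32 (2**-126), and arithmetic on them can be much slower on some CPUs. As a rough check of whether the trained weights contain any, something like the sketch below could be used. The helper names `to_float32` and `count_denormals` are made up for illustration and use only the standard library.

```python
import struct

# Smallest positive normal float32 value; any nonzero value with a
# smaller magnitude is subnormal (denormal) in float32.
FLOAT32_MIN_NORMAL = 2.0 ** -126


def to_float32(value):
    """Round a Python float to float32 precision, as a stored weight would be."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def count_denormals(values):
    """Count the values that are denormal when represented as float32."""
    count = 0
    for v in values:
        v32 = to_float32(v)
        if v32 != 0.0 and abs(v32) < FLOAT32_MIN_NORMAL:
            count += 1
    return count
```

For a PyTorch model, this could be applied per parameter, e.g. `count_denormals(p.detach().flatten().tolist())` for each `p` in `model.parameters()`. Note that `torch.set_flush_denormal(True)` returns `True` only if the system supports flushing denormals, so it is worth checking the return value. Intermediate activations can also become denormal, which a check of the weights alone would not reveal.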

nasretdinovr · July 15, 2022, 9:22am

I checked; it didn’t change anything.

ptrblck · July 15, 2022, 4:28pm

Unfortunately, I wouldn’t know what else might cause a performance difference on the CPU for exactly the same models using different parameter values.