Different model speed when loading weights from a snapshot vs. randomly initializing them

I have two models with the same architecture. The first uses trained weights; in the second, all weights (including biases) are drawn from a normal distribution. The second one is 50% faster than the first. I measured the time on CPU.
I also used torch.profiler and found that individual operations such as conv2d are faster in the second case. Is this normal behavior for a model?
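
For reference, a per-operator comparison of this kind could be set up roughly as follows. This is only a minimal sketch: the small architecture, the input shape, and the helper names make_model and op_times are hypothetical stand-ins, not the actual models from this thread, and model_a would be loaded from the trained snapshot in a real run.

```python
import torch
import torch.nn as nn
from torch.profiler import profile, ProfilerActivity


def make_model():
    # Hypothetical stand-in architecture; replace with the real model.
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1),
        nn.ReLU(),
        nn.Conv2d(16, 16, 3, padding=1),
    ).eval()


def op_times(model, x, runs=10):
    """Return {operator name: total CPU time in microseconds} over `runs` forward passes."""
    with torch.no_grad():
        model(x)  # warm-up pass, kept outside the profiled region
        with profile(activities=[ProfilerActivity.CPU]) as prof:
            for _ in range(runs):
                model(x)
    return {e.key: e.cpu_time_total for e in prof.key_averages()}


x = torch.randn(1, 3, 64, 64)  # assumed input shape

model_a = make_model()  # stand-in for the model loaded from the snapshot
model_b = make_model()
with torch.no_grad():
    for p in model_b.parameters():
        p.normal_()  # all weights and biases drawn from N(0, 1)

times_a = op_times(model_a, x)
times_b = op_times(model_b, x)
for name in ("aten::conv2d", "aten::relu"):
    print(name, times_a.get(name), times_b.get(name))
```

Running both models on the same input and comparing the same operator keys makes it easier to see which operations account for the gap.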

Could you check whether calling torch.set_flush_denormal(True) speeds up the slower model?
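
A rough way to run that check is sketched below. The tiny model, the input shape, and the helpers count_denormals and time_forward are hypothetical and only illustrate the idea; the real test would use the snapshot-loaded model. Counting denormal (subnormal) parameters first shows whether the trained weights contain any such values at all.

```python
import time
import torch
import torch.nn as nn


def count_denormals(model):
    """Count nonzero float32 parameters smaller in magnitude than the smallest normal value."""
    tiny = torch.finfo(torch.float32).tiny
    total = 0
    for p in model.parameters():
        a = p.detach().abs()
        total += int(((a > 0) & (a < tiny)).sum())
    return total


def time_forward(model, x, runs=20):
    """Average wall-clock seconds per forward pass on CPU."""
    model.eval()
    with torch.no_grad():
        model(x)  # warm-up pass
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
    return (time.perf_counter() - start) / runs


# Hypothetical stand-in; load the slower (trained) model here instead.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 8, 3, padding=1))
x = torch.randn(1, 3, 64, 64)  # assumed input shape

# Count before enabling flushing, since denormals are treated as zero afterwards.
print("denormal parameters:", count_denormals(model))
print("avg time, default mode:", time_forward(model, x))

supported = torch.set_flush_denormal(True)  # returns False if the CPU does not support it
print("flush-to-zero supported:", supported)
print("avg time, flush denormal:", time_forward(model, x))

torch.set_flush_denormal(False)  # restore the default mode
```

If the denormal count is zero and both timings match, denormal values are unlikely to explain the slowdown.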

I checked; it didn't change anything.

Unfortunately, I don't know what else might cause a performance difference on the CPU between identical models that differ only in their parameter values.