About CPU training speed

I built PyTorch from source. The output of torch.__config__.show() reports that MKLDNN is on. With this build, training one epoch of my model takes 5 min.

I also installed PyTorch 1.1 with conda. The output of torch.__config__.show() reports that MKLDNN is on there as well, but training one epoch of the same model takes 8 min.

How can I speed up CPU training with the conda-installed PyTorch?
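
To narrow down where the two installs differ, the same short script could be run in both environments and the output compared. This is only a diagnostic sketch, not a fix. The helper names describe_build and time_conv are hypothetical, and the small convolution is an assumed stand-in workload, not the actual model. Differences in thread count or math libraries are just some of the possible causes.

```python
import time

import torch


def describe_build():
    """Collect build and threading details that can differ between installs."""
    return {
        "version": torch.__version__,
        "mkldnn_available": torch.backends.mkldnn.is_available(),
        "mkl_available": torch.backends.mkl.is_available(),
        "num_threads": torch.get_num_threads(),
        # Full build configuration string (compiler flags, BLAS, MKLDNN, ...).
        "config": torch.__config__.show(),
    }


def time_conv(repeats=5):
    """Average seconds per forward/backward pass of a small CPU convolution.

    The layer and input sizes are arbitrary placeholders (an assumption);
    substitute a batch from the real model for a meaningful comparison.
    """
    conv = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1)
    x = torch.randn(8, 3, 64, 64)
    conv(x).sum().backward()  # warm-up pass, excluded from the timing
    start = time.perf_counter()
    for _ in range(repeats):
        conv.zero_grad()
        conv(x).sum().backward()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    info = describe_build()
    for key in ("version", "mkldnn_available", "mkl_available", "num_threads"):
        print(key, "=", info[key])
    print(info["config"])
    print("seconds per step:", time_conv())
```

If num_threads differs between the two environments, calling torch.set_num_threads with the same value in both before training makes the timings comparable. Whether that closes the 5 min versus 8 min gap is not something this sketch can confirm.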