Libtorch inference on CPU about 5 times slower than python version

I am using PyTorch to run inference with ResNet50. I am constrained to running it on CPU, so I have no other option.

To my surprise and confusion, the Python version runs inference in ~0.25 s, while the libtorch version takes ~0.7 s.
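When comparing numbers like these, it can help to time both sides the same way: a few untimed warm-up calls first, then several timed calls. Here is a minimal sketch of such a harness for the Python side. The names `benchmark` and `summarize` are made up for illustration, and `fn` stands for whatever runs one forward pass (for example `lambda: model(x)`); the same pattern would need to be mirrored in the C++ program.

```python
import statistics
import time


def benchmark(fn, warmup=3, runs=10):
    """Call fn() `warmup` times untimed, then time `runs` calls.

    Returns a list of per-call wall-clock timings in seconds.
    """
    for _ in range(warmup):
        fn()  # warm-up: one-time setup costs are kept out of the timings
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Reduce a list of timings to median, min and max."""
    return {
        "median": statistics.median(timings),
        "min": min(timings),
        "max": max(timings),
    }
```

If the first call is much slower than the rest on either side, a single-shot measurement may be overstating the gap.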

Any suggestions would be appreciated, as I am really new to this. Please note that GPU is not an option in this scenario: we are specifically asked to do this on CPU.

There are three typical causes for this kind of slowdown:

  • you made a debug build, or one without AVX,
  • you are missing some backend library in libtorch, or
  • the threading setup is missing.

The first two won’t happen when you use libtorch from the PyTorch installation.

For the third, you can debug this by looking at top while running your program to see if CPU usage goes over 100% (anything above 100% means more than one core is in use).
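If you would rather get a single number than watch top, here is a rough sketch that runs a command and compares the CPU time it consumed with the wall-clock time. The helper name `cpu_utilization` is made up for illustration, and it relies on `os.times()` reporting child-process CPU time, which only works on Unix-like systems (on Windows those fields are zero).

```python
import os
import subprocess
import time


def cpu_utilization(cmd):
    """Run cmd to completion and return (wall_seconds, cpu_seconds, ratio).

    ratio = cpu_seconds / wall_seconds. A value near 1.0 suggests one busy
    thread; a value well above 1.0 suggests several cores were in use.
    """
    before = os.times()
    start = time.perf_counter()
    subprocess.run(cmd, check=True)  # waits for the child to exit
    wall = time.perf_counter() - start
    after = os.times()
    # CPU time (user + system) accumulated by waited-for child processes.
    cpu = (after.children_user - before.children_user) + (
        after.children_system - before.children_system
    )
    return wall, cpu, cpu / wall


# Hypothetical usage (the binary and model path are placeholders):
#   wall, cpu, ratio = cpu_utilization(["./my_libtorch_app", "model.pt"])
#   print(f"utilization: {ratio * 100:.0f}%")
```

A ratio that stays around 1.0 for the libtorch binary would be consistent with the threading setup being the problem.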

Best regards


You are right, my build had an issue.

In my case, I tested the pretrained ResNet34 with both PyTorch and libtorch, using 400 images and the same preprocessing.
On my Mac:
PyTorch took 34 s, with CPU usage > 90%;
libtorch took 73 s, with CPU usage < 40%.

Why did this happen?
The libtorch build was downloaded from the official website.
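CPU usage below 40% in the libtorch run hints that it was not using all cores. One way to probe this is to run the same binary with different thread counts and see whether the wall time changes at all. The sketch below does that through the `OMP_NUM_THREADS` and `MKL_NUM_THREADS` environment variables. This assumes the libtorch build honors them, which is typical for OpenMP and MKL builds but is not guaranteed for every build; the helper name `time_with_threads` is made up for illustration.

```python
import os
import subprocess
import time


def time_with_threads(cmd, thread_counts=(1, 2, 4)):
    """Run cmd once per thread count and return {count: wall_seconds}.

    The thread count is passed via OMP_NUM_THREADS and MKL_NUM_THREADS
    (assumption: the program's threading backend reads these variables).
    """
    results = {}
    for n in thread_counts:
        env = dict(os.environ, OMP_NUM_THREADS=str(n), MKL_NUM_THREADS=str(n))
        start = time.perf_counter()
        subprocess.run(cmd, env=env, check=True)
        results[n] = time.perf_counter() - start
    return results


# Hypothetical usage (the binary and model path are placeholders):
#   print(time_with_threads(["./my_libtorch_app", "model.pt"]))
```

If the timings are the same for every thread count, the variables are probably being ignored or intra-op parallelism is not active in that build.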

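You may want to have a look at "PyTorch vs LibTorch: which is faster at network inference?" on Zhihu (original title: PyTorch vs LibTorch:网络推理速度谁更快?) and "Investigation of libtorch performance issues" (original title: libtorch性能问题调研) · Issue #30 · DeepVAC/libdeepvac · GitHub.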