PyTorch code translated to libtorch is ~6x slower

Hi, I translated some code from PyTorch to libtorch. The original code used some NumPy and Librosa functions, and I converted these to C++ libtorch, so in the new code every computation is done on tensors. However, the new C++ code is slower than the original PyTorch code (both were tested on Ubuntu 16.04, on CPU).
Why does this happen? Do I need to use BLAS, LAPACK, or TBB to optimize my code?
Could converting the NumPy code to tensors be the reason for the slow performance?