torch_cpu.dll is too big: 196 MB

Hello,

In the current libtorch 1.8.1 download (libtorch-win-shared-with-deps-1.8.1+cpu.zip), torch_cpu.dll is about 200 MB, which is huge.

We don’t need Caffe2 or anything else for creating a model on the C++ side; we only want to run inference. Basically, the code is:

torch::jit::script::Module l_module = torch::jit::load( "TheModelPath" );
...
at::Tensor output = l_module.forward( l_inputs ).toTensor();

And that’s all.

I built 1.8.1 from source with Visual Studio 2019, without Caffe2. The new DLL is about 68 MB, which is much better, but inference with that build is about 3 to 4 times slower.

I’m probably missing some options. Can somebody help me create a lightweight DLL that is still as fast as the one from the website?

Here is the CMake output after configuration:
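To put a number on that slowdown, the same forward call can be timed once against each torch_cpu.dll. Below is a minimal sketch using only the C++ standard library; `mean_ms_per_call` is a hypothetical helper name, not a libtorch function, and the warm-up and iteration counts are arbitrary assumptions.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of libtorch): calls `fn` `warmup` times
// untimed, then returns the mean wall-clock time per call in milliseconds
// over `iters` timed calls. Returns 0.0 when `iters` is zero.
template <typename Fn>
double mean_ms_per_call(Fn&& fn, std::size_t warmup, std::size_t iters) {
    for (std::size_t i = 0; i < warmup; ++i) fn();
    if (iters == 0) return 0.0;

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iters; ++i) fn();
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iters);
}

// Intended use with the snippet above (requires libtorch, not shown here):
//   double ms = mean_ms_per_call(
//       [&] { l_module.forward( l_inputs ).toTensor(); }, 5, 50);
```

Running this in two executables, one linked against each build, gives per-call times that can be compared directly. The warm-up calls are there because the first few forward passes of a scripted module tend to be slower than steady state.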

CMake Warning at CMakeLists.txt:273 (message):
  TensorPipe cannot be used on Windows.  Set it to OFF


std::exception_ptr is supported.
Current compiler supports avx2 extension. Will build perfkernels.
Current compiler supports avx512f extension. Will build fbgemm.
Building using own protobuf under third_party per request.
Use custom protobuf build.

3.11.4.0
Caffe2 protobuf include directory: $<BUILD_INTERFACE:C:/Dev/pytorch-1.8/pytorch/third_party/protobuf/src>$<INSTALL_INTERFACE:include>
Trying to find preferred BLAS backend of choice: MKL
MKL_THREADING = OMP
MKL_THREADING = OMP
CMake Warning at cmake/Dependencies.cmake:152 (message):
  MKL could not be found.  Defaulting to Eigen
Call Stack (most recent call first):
  CMakeLists.txt:564 (include)


CMake Warning at cmake/Dependencies.cmake:175 (message):
  Preferred BLAS (MKL) cannot be found, now searching for a general BLAS
  library
Call Stack (most recent call first):
  CMakeLists.txt:564 (include)


MKL_THREADING = OMP
Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core - libiomp5md]
  Library mkl_intel_lp64: not found
Checking for [mkl_intel - mkl_intel_thread - mkl_core - libiomp5md]
  Library mkl_intel: not found
Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core]
  Library mkl_intel_lp64: not found
Checking for [mkl_intel - mkl_intel_thread - mkl_core]
  Library mkl_intel: not found
Checking for [mkl_intel_lp64 - mkl_sequential - mkl_core]
  Library mkl_intel_lp64: not found
Checking for [mkl_intel - mkl_sequential - mkl_core]
  Library mkl_intel: not found
Checking for [mkl_intel_lp64 - mkl_core - libiomp5md - pthread]
  Library mkl_intel_lp64: not found
Checking for [mkl_intel - mkl_core - libiomp5md - pthread]
  Library mkl_intel: not found
Checking for [mkl_intel_lp64 - mkl_core - pthread]
  Library mkl_intel_lp64: not found
Checking for [mkl_intel - mkl_core - pthread]
  Library mkl_intel: not found
Checking for [mkl - guide - pthread - m]
  Library mkl: not found
MKL library not found
Checking for [Accelerate]
  Library Accelerate: BLAS_Accelerate_LIBRARY-NOTFOUND
Checking for [vecLib]
  Library vecLib: BLAS_vecLib_LIBRARY-NOTFOUND
Could not find OpenBLAS include. Turning OpenBLAS_FOUND off
Could not find OpenBLAS lib. Turning OpenBLAS_FOUND off
Checking for [openblas]
  Library openblas: BLAS_openblas_LIBRARY-NOTFOUND
Checking for [openblas - pthread - m]
  Library openblas: BLAS_openblas_LIBRARY-NOTFOUND
Checking for [libopenblas]
  Library libopenblas: BLAS_libopenblas_LIBRARY-NOTFOUND
Checking for [goto2 - gfortran]
  Library goto2: BLAS_goto2_LIBRARY-NOTFOUND
Checking for [goto2 - gfortran - pthread]
  Library goto2: BLAS_goto2_LIBRARY-NOTFOUND
Checking for [acml - gfortran]
  Library acml: BLAS_acml_LIBRARY-NOTFOUND
Checking for [blis]
  Library blis: BLAS_blis_LIBRARY-NOTFOUND
Could NOT find Atlas (missing: Atlas_CBLAS_INCLUDE_DIR Atlas_CLAPACK_INCLUDE_DIR Atlas_CBLAS_LIBRARY Atlas_BLAS_LIBRARY Atlas_LAPACK_LIBRARY) 
Checking for [ptf77blas - atlas - gfortran]
  Library ptf77blas: BLAS_ptf77blas_LIBRARY-NOTFOUND
Checking for [blas]
  Library blas: BLAS_blas_LIBRARY-NOTFOUND
Cannot find a library with BLAS API. Not using BLAS.
CMake Warning (dev) at C:/Program Files/CMake/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:273 (message):
  The package name passed to `find_package_handle_standard_args` (OpenMP_C)
  does not match the name of the calling package (OpenMP).  This can lead to
  problems in calling code that expects `find_package` result variables
  (e.g., `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
  cmake/Modules/FindOpenMP.cmake:565 (find_package_handle_standard_args)
  third_party/fbgemm/CMakeLists.txt:59 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) at C:/Program Files/CMake/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:273 (message):
  The package name passed to `find_package_handle_standard_args` (OpenMP_CXX)
  does not match the name of the calling package (OpenMP).  This can lead to
  problems in calling code that expects `find_package` result variables
  (e.g., `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
  cmake/Modules/FindOpenMP.cmake:565 (find_package_handle_standard_args)
  third_party/fbgemm/CMakeLists.txt:59 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning at third_party/fbgemm/CMakeLists.txt:61 (message):
  OpenMP found! OpenMP_C_INCLUDE_DIRS =


CMake Warning at third_party/fbgemm/CMakeLists.txt:136 (message):
  ==========


CMake Warning at third_party/fbgemm/CMakeLists.txt:137 (message):
  CMAKE_BUILD_TYPE = Release


CMake Warning at third_party/fbgemm/CMakeLists.txt:138 (message):
  CMAKE_CXX_FLAGS_DEBUG is /MDd /Z7 /Ob0 /Od /RTC1 /w /MP /bigobj


CMake Warning at third_party/fbgemm/CMakeLists.txt:139 (message):
  CMAKE_CXX_FLAGS_RELEASE is /MD /O2 /Ob2 /DNDEBUG /w /MP /bigobj


CMake Warning at third_party/fbgemm/CMakeLists.txt:140 (message):
  ==========


** AsmJit Summary **
   ASMJIT_DIR=C:/Dev/pytorch-1.8/pytorch/third_party/fbgemm/third_party/asmjit
   ASMJIT_TEST=OFF
   ASMJIT_TARGET_TYPE=SHARED
   ASMJIT_DEPS=
   ASMJIT_LIBS=asmjit
   ASMJIT_CFLAGS=
   ASMJIT_PRIVATE_CFLAGS=-MP;-GR-;-GF;-Zc:inline;-Zc:strictStrings;-Zc:threadSafeInit-;-W4
   ASMJIT_PRIVATE_CFLAGS_DBG=-GS
   ASMJIT_PRIVATE_CFLAGS_REL=-GS-;-O2;-Oi
Using third party subdirectory Eigen.
Using third_party/pybind11.
pybind11 include dirs: C:/Dev/pytorch-1.8/pytorch/cmake/../third_party/pybind11/include
Adding OpenMP CXX_FLAGS: -openmp:experimental
No OpenMP library needs to be linked against
Generated: C:/Dev/pytorch-1.8/pytorch/build/third_party/onnx/onnx/onnx_onnx_torch-ml.proto
Generated: C:/Dev/pytorch-1.8/pytorch/build/third_party/onnx/onnx/onnx-operators_onnx_torch-ml.proto
Generated: C:/Dev/pytorch-1.8/pytorch/build/third_party/onnx/onnx/onnx-data_onnx_torch.proto

******** Summary ********
  CMake version         : 3.18.4
  CMake command         : C:/Program Files/CMake/bin/cmake.exe
  System                : Windows
  C++ compiler          : C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.28.29910/bin/Hostx64/x64/cl.exe
  C++ compiler version  : 19.28.29913.0
  CXX flags             : /DWIN32 /D_WINDOWS /GR /EHsc /w /MP /bigobj -DUSE_PTHREADPOOL -openmp:experimental
  Build type            : Release
  Compile definitions   : WIN32_LEAN_AND_MEAN;ONNX_ML=1;ONNXIFI_ENABLE_EXT=1
  CMAKE_PREFIX_PATH     : 
  CMAKE_INSTALL_PREFIX  : C:/Program Files (x86)/Torch
  CMAKE_MODULE_PATH     : C:/Dev/pytorch-1.8/pytorch/cmake/Modules

  ONNX version          : 1.8.0
  ONNX NAMESPACE        : onnx_torch
  ONNX_BUILD_TESTS      : OFF
  ONNX_BUILD_BENCHMARKS : OFF
  ONNX_USE_LITE_PROTO   : OFF
  ONNXIFI_DUMMY_BACKEND : OFF
  ONNXIFI_ENABLE_EXT    : OFF

  Protobuf compiler     : 
  Protobuf includes     : 
  Protobuf libraries    : 
  BUILD_ONNX_PYTHON     : OFF

******** Summary ********
  CMake version         : 3.18.4
  CMake command         : C:/Program Files/CMake/bin/cmake.exe
  System                : Windows
  C++ compiler          : C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.28.29910/bin/Hostx64/x64/cl.exe
  C++ compiler version  : 19.28.29913.0
  CXX flags             : /DWIN32 /D_WINDOWS /GR /EHsc /w /MP /bigobj -DUSE_PTHREADPOOL -openmp:experimental
  Build type            : Release
  Compile definitions   : WIN32_LEAN_AND_MEAN;ONNX_ML=1;ONNXIFI_ENABLE_EXT=1
  CMAKE_PREFIX_PATH     : 
  CMAKE_INSTALL_PREFIX  : C:/Program Files (x86)/Torch
  CMAKE_MODULE_PATH     : C:/Dev/pytorch-1.8/pytorch/cmake/Modules

  ONNX version          : 1.4.1
  ONNX NAMESPACE        : onnx_torch
  ONNX_BUILD_TESTS      : OFF
  ONNX_BUILD_BENCHMARKS : OFF
  ONNX_USE_LITE_PROTO   : OFF
  ONNXIFI_DUMMY_BACKEND : OFF

  Protobuf compiler     : 
  Protobuf includes     : 
  Protobuf libraries    : 
  BUILD_ONNX_PYTHON     : OFF
Could not find CUDA with FP16 support, compiling without torch.CudaHalfTensor
Adding -DNDEBUG to compile flags
MAGMA not found. Compiling without MAGMA support
Could not find hardware support for NEON on this machine.
No OMAP3 processor on this machine.
No OMAP4 processor on this machine.
AVX compiler support found
AVX2 compiler support found
MKL_THREADING = OMP
Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core - libiomp5md]
  Library mkl_intel_lp64: not found
Checking for [mkl_intel - mkl_intel_thread - mkl_core - libiomp5md]
  Library mkl_intel: not found
Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core]
  Library mkl_intel_lp64: not found
Checking for [mkl_intel - mkl_intel_thread - mkl_core]
  Library mkl_intel: not found
Checking for [mkl_intel_lp64 - mkl_sequential - mkl_core]
  Library mkl_intel_lp64: not found
Checking for [mkl_intel - mkl_sequential - mkl_core]
  Library mkl_intel: not found
Checking for [mkl_intel_lp64 - mkl_core - libiomp5md - pthread]
  Library mkl_intel_lp64: not found
Checking for [mkl_intel - mkl_core - libiomp5md - pthread]
  Library mkl_intel: not found
Checking for [mkl_intel_lp64 - mkl_core - pthread]
  Library mkl_intel_lp64: not found
Checking for [mkl_intel - mkl_core - pthread]
  Library mkl_intel: not found
Checking for [mkl - guide - pthread - m]
  Library mkl: not found
MKL library not found
Checking for [Accelerate]
  Library Accelerate: BLAS_Accelerate_LIBRARY-NOTFOUND
Checking for [vecLib]
  Library vecLib: BLAS_vecLib_LIBRARY-NOTFOUND
Could not find OpenBLAS include. Turning OpenBLAS_FOUND off
Could not find OpenBLAS lib. Turning OpenBLAS_FOUND off
Checking for [openblas]
  Library openblas: BLAS_openblas_LIBRARY-NOTFOUND
Checking for [openblas - pthread - m]
  Library openblas: BLAS_openblas_LIBRARY-NOTFOUND
Checking for [libopenblas]
  Library libopenblas: BLAS_libopenblas_LIBRARY-NOTFOUND
Checking for [goto2 - gfortran]
  Library goto2: BLAS_goto2_LIBRARY-NOTFOUND
Checking for [goto2 - gfortran - pthread]
  Library goto2: BLAS_goto2_LIBRARY-NOTFOUND
Checking for [acml - gfortran]
  Library acml: BLAS_acml_LIBRARY-NOTFOUND
Checking for [blis]
  Library blis: BLAS_blis_LIBRARY-NOTFOUND
Could NOT find Atlas (missing: Atlas_CBLAS_INCLUDE_DIR Atlas_CLAPACK_INCLUDE_DIR Atlas_CBLAS_LIBRARY Atlas_BLAS_LIBRARY Atlas_LAPACK_LIBRARY) 
Checking for [ptf77blas - atlas - gfortran]
  Library ptf77blas: BLAS_ptf77blas_LIBRARY-NOTFOUND
Checking for [blas]
  Library blas: BLAS_blas_LIBRARY-NOTFOUND
Cannot find a library with BLAS API. Not using BLAS.
LAPACK requires BLAS
Cannot find a library with LAPACK API. Not using LAPACK.
disabling CUDA because NOT USE_CUDA is set
USE_CUDNN is set to 0. Compiling without cuDNN support
disabling ROCM because NOT USE_ROCM is set
MIOpen not found. Compiling without MIOpen support
MKL_THREADING = OMP
Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core - libiomp5md]
  Library mkl_intel_lp64: not found
Checking for [mkl_intel - mkl_intel_thread - mkl_core - libiomp5md]
  Library mkl_intel: not found
Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core]
  Library mkl_intel_lp64: not found
Checking for [mkl_intel - mkl_intel_thread - mkl_core]
  Library mkl_intel: not found
Checking for [mkl_intel_lp64 - mkl_sequential - mkl_core]
  Library mkl_intel_lp64: not found
Checking for [mkl_intel - mkl_sequential - mkl_core]
  Library mkl_intel: not found
Checking for [mkl_intel_lp64 - mkl_core - libiomp5md - pthread]
  Library mkl_intel_lp64: not found
Checking for [mkl_intel - mkl_core - libiomp5md - pthread]
  Library mkl_intel: not found
Checking for [mkl_intel_lp64 - mkl_core - pthread]
  Library mkl_intel_lp64: not found
Checking for [mkl_intel - mkl_core - pthread]
  Library mkl_intel: not found
Checking for [mkl - guide - pthread - m]
  Library mkl: not found
MKL library not found
Checking for [Accelerate]
  Library Accelerate: BLAS_Accelerate_LIBRARY-NOTFOUND
Checking for [vecLib]
  Library vecLib: BLAS_vecLib_LIBRARY-NOTFOUND
Could not find OpenBLAS include. Turning OpenBLAS_FOUND off
Could not find OpenBLAS lib. Turning OpenBLAS_FOUND off
Checking for [openblas]
  Library openblas: BLAS_openblas_LIBRARY-NOTFOUND
Checking for [openblas - pthread - m]
  Library openblas: BLAS_openblas_LIBRARY-NOTFOUND
Checking for [libopenblas]
  Library libopenblas: BLAS_libopenblas_LIBRARY-NOTFOUND
Checking for [goto2 - gfortran]
  Library goto2: BLAS_goto2_LIBRARY-NOTFOUND
Checking for [goto2 - gfortran - pthread]
  Library goto2: BLAS_goto2_LIBRARY-NOTFOUND
Checking for [acml - gfortran]
  Library acml: BLAS_acml_LIBRARY-NOTFOUND
Checking for [blis]
  Library blis: BLAS_blis_LIBRARY-NOTFOUND
Could NOT find Atlas (missing: Atlas_CBLAS_INCLUDE_DIR Atlas_CLAPACK_INCLUDE_DIR Atlas_CBLAS_LIBRARY Atlas_BLAS_LIBRARY Atlas_LAPACK_LIBRARY) 
Checking for [ptf77blas - atlas - gfortran]
  Library ptf77blas: BLAS_ptf77blas_LIBRARY-NOTFOUND
Checking for [blas]
  Library blas: BLAS_blas_LIBRARY-NOTFOUND
Cannot find a library with BLAS API. Not using BLAS.
MKLDNN_CPU_RUNTIME = OMP
CMake Warning (dev) at C:/Program Files/CMake/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:273 (message):
  The package name passed to `find_package_handle_standard_args` (OpenMP_C)
  does not match the name of the calling package (OpenMP).  This can lead to
  problems in calling code that expects `find_package` result variables
  (e.g., `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
  cmake/Modules/FindOpenMP.cmake:565 (find_package_handle_standard_args)
  third_party/ideep/mkl-dnn/cmake/OpenMP.cmake:61 (find_package)
  third_party/ideep/mkl-dnn/CMakeLists.txt:118 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) at C:/Program Files/CMake/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:273 (message):
  The package name passed to `find_package_handle_standard_args` (OpenMP_CXX)
  does not match the name of the calling package (OpenMP).  This can lead to
  problems in calling code that expects `find_package` result variables
  (e.g., `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
  cmake/Modules/FindOpenMP.cmake:565 (find_package_handle_standard_args)
  third_party/ideep/mkl-dnn/cmake/OpenMP.cmake:61 (find_package)
  third_party/ideep/mkl-dnn/CMakeLists.txt:118 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

GPU support is disabled
Primitive cache is enabled
Found MKL-DNN: TRUE
Version: 7.0.3
Build type: Release
CXX_STANDARD: 14
Required features: cxx_variadic_templates
Not using libkineto in a non-CUDA build.
Could NOT find Backtrace (missing: Backtrace_LIBRARY Backtrace_INCLUDE_DIR) 
don't use NUMA
Using ATen parallel backend: OMP
disabling CUDA because USE_CUDA is set false
AT_INSTALL_INCLUDE_DIR include/ATen/core
core header install: C:/Dev/pytorch-1.8/pytorch/build/aten/src/ATen/core/TensorBody.h
CMake Warning (dev) at C:/Program Files/CMake/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:273 (message):
  The package name passed to `find_package_handle_standard_args` (OpenMP_C)
  does not match the name of the calling package (OpenMP).  This can lead to
  problems in calling code that expects `find_package` result variables
  (e.g., `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
  cmake/Modules/FindOpenMP.cmake:565 (find_package_handle_standard_args)
  caffe2/CMakeLists.txt:1020 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) at C:/Program Files/CMake/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:273 (message):
  The package name passed to `find_package_handle_standard_args` (OpenMP_CXX)
  does not match the name of the calling package (OpenMP).  This can lead to
  problems in calling code that expects `find_package` result variables
  (e.g., `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
  cmake/Modules/FindOpenMP.cmake:565 (find_package_handle_standard_args)
  caffe2/CMakeLists.txt:1020 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.

pytorch is compiling with OpenMP. 
OpenMP CXX_FLAGS: -openmp:experimental. 
OpenMP libraries: .
Caffe2 is compiling with OpenMP. 
OpenMP CXX_FLAGS: -openmp:experimental. 
OpenMP libraries: .
CMake Warning at CMakeLists.txt:854 (message):
  Generated cmake files are only fully tested if one builds with system glog,
  gflags, and protobuf.  Other settings may generate files that are not well
  tested.



******** Summary ********
General:
  CMake version         : 3.18.4
  CMake command         : C:/Program Files/CMake/bin/cmake.exe
  System                : Windows
  C++ compiler          : C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.28.29910/bin/Hostx64/x64/cl.exe
  C++ compiler id       : MSVC
  C++ compiler version  : 19.28.29913.0
  CXX flags             : /DWIN32 /D_WINDOWS /GR /EHsc /w /MP /bigobj -DUSE_PTHREADPOOL -openmp:experimental -DNDEBUG -DUSE_FBGEMM -DUSE_XNNPACK
  Build type            : Release
  Compile definitions   : WIN32_LEAN_AND_MEAN;ONNX_ML=1;ONNXIFI_ENABLE_EXT=1;ONNX_NAMESPACE=onnx_torch;_CRT_SECURE_NO_DEPRECATE=1;USE_EXTERNAL_MZCRC;MINIZ_DISABLE_ZIP_READER_CRC32_CHECKS
  CMAKE_PREFIX_PATH     : 
  CMAKE_INSTALL_PREFIX  : C:/Program Files (x86)/Torch

  TORCH_VERSION         : 1.8.0
  CAFFE2_VERSION        : 1.8.0
  BUILD_CAFFE2          : OFF
  BUILD_CAFFE2_OPS      : OFF
  BUILD_CAFFE2_MOBILE   : OFF
  BUILD_STATIC_RUNTIME_BENCHMARK: OFF
  BUILD_TENSOREXPR_BENCHMARK: OFF
  BUILD_BINARY          : OFF
  BUILD_CUSTOM_PROTOBUF : ON
    Link local protobuf : ON
  BUILD_DOCS            : OFF
  BUILD_PYTHON          : OFF
  BUILD_SHARED_LIBS     : ON
  CAFFE2_USE_MSVC_STATIC_RUNTIME     : OFF
  BUILD_TEST            : OFF
  BUILD_JNI             : OFF
  BUILD_MOBILE_AUTOGRAD : OFF
  INTERN_BUILD_MOBILE   : 
  USE_BLAS              : 0
  USE_LAPACK            : 0
  USE_ASAN              : OFF
  USE_CPP_CODE_COVERAGE : OFF
  USE_CUDA              : OFF
  USE_ROCM              : OFF
  USE_EIGEN_FOR_BLAS    : ON
  USE_FBGEMM            : ON
    USE_FAKELOWP          : OFF
  USE_KINETO            : OFF
  USE_FFMPEG            : OFF
  USE_GFLAGS            : OFF
  USE_GLOG              : OFF
  USE_LEVELDB           : OFF
  USE_LITE_PROTO        : OFF
  USE_LMDB              : OFF
  USE_METAL             : OFF
  USE_PYTORCH_METAL     : OFF
  USE_FFTW              : OFF
  USE_MKL               : OFF
  USE_MKLDNN            : ON
  USE_NCCL              : OFF
  USE_NNPACK            : OFF
  USE_NUMPY             : OFF
  USE_OBSERVERS         : ON
  USE_OPENCL            : OFF
  USE_OPENCV            : OFF
  USE_OPENMP            : ON
  USE_TBB               : OFF
  USE_VULKAN            : OFF
  USE_PROF              : OFF
  USE_QNNPACK           : OFF
  USE_PYTORCH_QNNPACK   : OFF
  USE_REDIS             : OFF
  USE_ROCKSDB           : OFF
  USE_ZMQ               : OFF
  USE_DISTRIBUTED       : OFF
  USE_DEPLOY           : OFF
  Public Dependencies  : Threads::Threads
  Private Dependencies : pthreadpool;cpuinfo;XNNPACK;fbgemm;fp16;foxi_loader;fmt::fmt-header-only
Configuring done
Generating done

Thanks !!

Based on your configure log, MKL could not be found:

MKL could not be found.  Defaulting to Eigen
[...]
USE_MKL               : OFF

so you might want to install it.
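The summary block in a log like this is long, so the disabled math backends are easy to miss. As a rough sketch, the check can be automated by scanning the `NAME : VALUE` lines of the summary; `disabled_options` and `trim` are hypothetical helper names, and the parsing assumes the layout shown in the log above.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Strips leading and trailing whitespace (including '\r' from Windows logs).
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: returns, in log order, the options from `wanted`
// whose summary value is "OFF" or "0". Lines are split at the first ':'.
std::vector<std::string> disabled_options(const std::string& log,
                                          const std::vector<std::string>& wanted) {
    std::vector<std::string> result;
    std::istringstream in(log);
    std::string line;
    while (std::getline(in, line)) {
        const auto colon = line.find(':');
        if (colon == std::string::npos) continue;
        const std::string key = trim(line.substr(0, colon));
        const std::string value = trim(line.substr(colon + 1));
        const bool is_wanted =
            std::find(wanted.begin(), wanted.end(), key) != wanted.end();
        if (is_wanted && (value == "OFF" || value == "0")) {
            result.push_back(key);
        }
    }
    return result;
}
```

Fed the summary above with `{"USE_BLAS", "USE_LAPACK", "USE_MKL", "USE_MKLDNN"}`, this would flag `USE_BLAS`, `USE_LAPACK` and `USE_MKL`, which matches the diagnosis that the build fell back to Eigen.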

Hello, yes, I discovered in the meantime that MKL can be important :sweat_smile:
Thanks for the advice. I will install it and try to build libtorch again.

I was able to build libtorch 1.8.1 with MKL and get the same performance as the downloaded version, thanks.

However torch_cpu.dll now weighs 173 MB :confused:

This makes deploying a torch-based app quite heavy. Is it a good idea to file a bug or an improvement request for this? And isn’t there a way to select just what’s needed for inference only?

Thanks

Based on your findings, MKL apparently adds about 105 MB (173 - 68) to the DLL, most likely because it ships optimized code.
Sure, you could create an issue on GitHub and see if the size added by MKL could be reduced.
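For reference, the size arithmetic from this thread can be written out as a small sketch. `SizeDelta` and `size_delta` are hypothetical names, and attributing the whole difference to MKL is the assumption made in the reply above, not something measured.

```cpp
// Hypothetical record: megabytes added between two builds, and the share
// of the larger build that the added megabytes represent.
struct SizeDelta {
    int added_mb;
    double share_of_total;
};

// Compares a build with an extra component against one without it.
// Returns a zero share when `with_mb` is not positive.
SizeDelta size_delta(int with_mb, int without_mb) {
    const int added = with_mb - without_mb;
    const double share =
        with_mb > 0 ? static_cast<double>(added) / with_mb : 0.0;
    return {added, share};
}
```

With the numbers reported here, `size_delta(173, 68)` gives 105 MB added, which is roughly 61% of the 173 MB DLL.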