How is MKL bundled with PyTorch in PyPI wheels?

Running torch.backends.mkl.is_available() against PyPI-installed versions of PyTorch returns True. However, I cannot see where the MKL runtime (.so files) is actually contained within the wheel. Could someone let me know how PyTorch bundles MKL into its pip-installable wheels?
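For context, here is roughly how I looked for bundled MKL libraries (a sketch: the throwaway demo/ tree below stands in for site-packages/torch/lib; in practice you would run the find against your real torch install directory, e.g. via python -c "import torch, os; print(os.path.dirname(torch.__file__))"):

```shell
# Fake directory tree standing in for site-packages/torch/lib
mkdir -p demo/torch/lib
touch demo/torch/lib/libtorch_python.so \
      demo/torch/lib/libgomp-75eea7e8.so.1

# Look for MKL runtime libraries -- prints nothing for this layout:
find demo -name 'libmkl*.so*'

# List all bundled shared objects:
find demo -name '*.so*' | sort
```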

Yes, I observed this as well with torch 1.9.1.

Running print(torch.__config__.parallel_info()) seems to indicate that MKL is enabled.

ATen/Parallel:
        at::get_num_threads() : 4
        at::get_num_interop_threads() : 4
OpenMP 201511 (a.k.a. OpenMP 4.5)
        omp_get_max_threads() : 4
Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
        mkl_get_max_threads() : 4
Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
std::thread::hardware_concurrency() : 8
Environment variables:
        OMP_NUM_THREADS : [not set]
        MKL_NUM_THREADS : [not set]
ATen parallel backend: OpenMP

However, if I run ldd libtorch_python.so | grep "libmkl*", I get no matches.

ldd only lists dynamically linked shared libraries, and MKL is statically linked into the binary, if I'm not mistaken. Use nm -gDC instead to inspect the exported symbols.

Thanks. Running nm -gDC libtorch_python.so | grep mkl finds only MKL-DNN symbols:

0000000000a33d80 T THPVariable_is_mkldnn(THPVariable*, void*)
                 U at::_baddbmm_mkl_(at::Tensor&, at::Tensor const&, at::Tensor const&, c10::Scalar const&, c10::Scalar const&)
                 U at::mkldnn_linear(at::Tensor const&, at::Tensor const&, c10::optional<at::Tensor> const&)
                 U at::_mkldnn_reshape(at::Tensor const&, c10::ArrayRef<long>)
                 U at::_mkldnn_transpose(at::Tensor const&, long, long)
                 U at::mkldnn_max_pool2d(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool)
                 U at::mkldnn_max_pool3d(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool)
                 U at::_mkldnn_transpose_(at::Tensor&, long, long)
                 U at::mkldnn_convolution(at::Tensor const&, at::Tensor const&, c10::optional<at::Tensor> const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, long)
                 U at::mkldnn_adaptive_avg_pool2d(at::Tensor const&, c10::ArrayRef<long>)
                 U at::mkldnn_reorder_conv2d_weight(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, long)
                 U at::mkldnn_reorder_conv3d_weight(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, long)
                 U at::mkldnn_linear_backward_weights(at::Tensor const&, at::Tensor const&, at::Tensor const&, bool)
                 U at::mkldnn_convolution_backward_weights(c10::ArrayRef<long>, at::Tensor const&, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, long, bool)
                 U at::Tensor::to_mkldnn(c10::optional<c10::ScalarType>) const

Do you know where the static linking of MKL happens?