Using Custom BLAS/LAPACK for PyTorch C++ Frontend

talnish · August 3, 2020, 5:50pm

I am using the PyTorch C++ frontend on an AMD Threadripper machine.

I built libtorch by following the instructions listed here:
https://pytorch.org/cppdocs/installing.html

I want to use a BLAS/LAPACK library optimized for AMD platforms (https://developer.amd.com/amd-aocl/blas-library/), and I am wondering how to link against it when building libtorch.

copythatpasta · August 3, 2020, 7:03pm

Did you actually build PyTorch from source, or did you just use the prebuilt libtorch? It should be as simple as placing the AMD BLAS library in your main library directory and running the build process. PyTorch will find the BLAS package as long as it is in a standard system library path (for example /usr/lib or /usr/local/lib).

Follow the build process here to build from source

talnish · August 5, 2020, 4:40pm

Thanks for your answer.
I was able to make the build system detect the BLIS (AMD BLAS) installation, judging by the following messages that it prints while building from source:

-- Checking for [blis]
--   Library blis: /usr/local/lib/libblis.so
-- Looking for sgemm_
-- Looking for sgemm_ - found
-- Performing Test BLAS_F2C_DOUBLE_WORKS
-- Performing Test BLAS_F2C_DOUBLE_WORKS - Failed
-- Performing Test BLAS_F2C_FLOAT_WORKS
-- Performing Test BLAS_F2C_FLOAT_WORKS - Success
-- Performing Test BLAS_USE_CBLAS_DOT
-- Performing Test BLAS_USE_CBLAS_DOT - Failed
-- Found a library with BLAS API (FLAME).

However, when I run the following commands:

>>> import torch 
>>> print(*torch.__config__.show().split("\n"), sep="\n")

it prints this:

PyTorch built with:
  - GCC 7.5
  - C++ Version: 201402
  - Intel(R) MKL-DNN v1.5.0 (Git Hash e2ac1fac44c5078ca927cb9b90e1b3066a0b2ed0)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=OFF, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

I am not sure how to interpret this: it shows MKL-DNN, BLAS=MKL (I was hoping to see BLAS=FLAME), and USE_MKL=OFF.
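Since the "Build settings" line is long, the few fields in question are easier to compare when pulled out into a dictionary. The following is a minimal sketch, not part of PyTorch: `parse_build_settings` is a hypothetical helper, `SAMPLE` is a shortened copy of the output above, and the parsing assumes that no value contains a comma (true for the output shown here, but not guaranteed in general).

```python
def parse_build_settings(config_text):
    """Extract KEY=VALUE pairs from the 'Build settings' line.

    Hypothetical helper; assumes settings are comma-separated and that
    no value (including CXX_FLAGS) contains a comma.
    """
    prefix = "- Build settings:"
    settings = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        for item in line[len(prefix):].split(","):
            # Split on the first '=' only, so flags such as
            # -Wno-error=pedantic stay inside the CXX_FLAGS value.
            key, sep, value = item.partition("=")
            if sep:
                settings[key.strip()] = value.strip()
    return settings


# Shortened copy of the output quoted above (most CXX_FLAGS omitted).
SAMPLE = """PyTorch built with:
  - GCC 7.5
  - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -O2 -fPIC -Wno-error=pedantic, USE_EIGEN_FOR_BLAS=ON, USE_MKL=OFF, USE_MKLDNN=ON, """

settings = parse_build_settings(SAMPLE)
for key in ("BLAS", "USE_MKL", "USE_MKLDNN", "USE_EIGEN_FOR_BLAS"):
    print(key, "=", settings.get(key))
```

On a real installation, the same function can be given `torch.__config__.show()` instead of `SAMPLE`.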

Am I missing something here? I am not sure whether it is actually using BLIS or MKL.
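One way to check this independently of the reported build settings, on Linux, is to look at which shared libraries the process has actually mapped. The sketch below is only an illustration: `blas_libraries_in_maps` is a hypothetical helper, the keyword list is an assumption about typical library file names, and the sample text uses made-up addresses. It cannot detect a BLAS that was linked statically into libtorch.

```python
import os


def blas_libraries_in_maps(maps_text,
                           keywords=("libblis", "libflame",
                                     "libmkl_", "libopenblas")):
    """Return the sorted paths of mapped libraries whose file name
    contains one of the keywords.

    Hypothetical helper; expects text in the format of /proc/<pid>/maps,
    where the sixth whitespace-separated field is the file path.
    """
    found = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        name = os.path.basename(path).lower()
        if any(keyword in name for keyword in keywords):
            found.add(path)
    return sorted(found)


# Made-up example in /proc/self/maps format (addresses are illustrative).
SAMPLE_MAPS = """\
7f0000000000-7f0000100000 r-xp 00000000 08:01 1001 /usr/local/lib/libblis.so.3
7f0000100000-7f0000200000 r--p 00100000 08:01 1001 /usr/local/lib/libblis.so.3
7f0000200000-7f0000300000 r-xp 00000000 08:01 1002 /usr/lib/x86_64-linux-gnu/libc.so.6
7f0000300000-7f0000400000 rw-p 00000000 00:00 0
"""

print(blas_libraries_in_maps(SAMPLE_MAPS))
# prints ['/usr/local/lib/libblis.so.3']
```

To apply it to a live session, import torch, run a matrix multiplication so that the BLAS library is loaded, and then pass the contents of `/proc/self/maps` (read with `open("/proc/self/maps").read()`) to the function.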