Pytorch Build "almost" succeeds but fails | Undefined symbols for architecture x86_64 __cvtu32_mask16

I am also getting this from a clean git clone. I am using macOS (Mojave 10.14.6) and I just updated Xcode to the latest version.

I did a fresh git clone of the pytorch repository and started following the README directions. This is what I got:

Undefined symbols for architecture x86_64:
  "__cvtu32_mask16", referenced from:
      _xnn_f32_clamp_ukernel__avx512f in libXNNPACK.a(avx512f.c.o)
      _xnn_f32_dwconv_ukernel_up16x25__avx512f in libXNNPACK.a(up16x25-avx512f.c.o)
      _xnn_f32_dwconv_ukernel_up16x4__avx512f in libXNNPACK.a(up16x4-avx512f.c.o)
      _xnn_f32_dwconv_ukernel_up16x9__avx512f in libXNNPACK.a(up16x9-avx512f.c.o)

Here’s the exact order of what I did following the fresh git clone.

conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi
	# Omitted `typing` because I’m on Python 3.7
git submodule sync
git submodule update --init --recursive
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
USE_CUDA=0 MACOSX_DEPLOYMENT_TARGET=10.14 CC=clang CXX=clang++ python setup.py install

Clang

clang --version
Apple LLVM version 10.0.1 (clang-1001.0.46.4)
Target: x86_64-apple-darwin18.7.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

GCC (same as clang)

gcc --version
Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/c++/4.2.1
Apple LLVM version 10.0.1 (clang-1001.0.46.4)
Target: x86_64-apple-darwin18.7.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

Could you try disabling the XNNPACK build by setting USE_XNNPACK=0?

Thanks for the reply!

That did not fix it.

I stumbled upon this documentation from Doxygen, which I’m aware PyTorch uses per CONTRIBUTING.md.

You can see the _cvtu32_mask16 function is specified there.

This is definitely beyond my realm of expertise ><. I commented on an issue on the GitHub project but I wasn’t sure if that was the right place. It appears someone else is getting the same issue and it happened very recently.

Could you share the link to the issue where you commented? I have the same issue when building on macOS 10.13, but it looks fine on macOS 10.15.

I believe I have figured out why.

The documentation where you found _cvtu32_mask16 indicates that it is an LLVM (clang) function, and the error message shows that XNNPACK is the code referencing it.

So I dug into the XNNPACK source code and found this line: https://github.com/google/XNNPACK/blob/master/src/xnnpack/intrinsics-polyfill.h#L36

When I check out pytorch tags/v1.5.0, the same line says "Apple Clang pre-10", so the condition is (__apple_build_version__ < 10000000). For Apple clang 10 that condition is false, so XNNPACK does not define its own _cvtu32_mask16 implementation and relies on clang to provide it. The build then fails at link time because the clang installed on macOS 10.13/10.14 does not provide _cvtu32_mask16.
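To make the effect of that version check concrete, here is a minimal sketch in C. It is not XNNPACK's code: every name starting with sketch_ or SKETCH_ is hypothetical, the 16-bit typedef stands in for __mmask16 so the file builds without AVX-512 headers, the build number 10010046 is what I assume Apple LLVM 10.0.1 (clang-1001.0.46.4) reports as __apple_build_version__, and the 11000000 threshold is only an illustration of a "pre-11" check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for __mmask16 so the sketch builds without AVX-512 headers. */
typedef uint16_t sketch_mmask16;

/* Fallback with the same meaning as _cvtu32_mask16: keep the low 16 bits. */
static inline sketch_mmask16 sketch_cvtu32_mask16(unsigned int mask) {
  return (sketch_mmask16) mask;
}

/* Returns 1 when a guard of the form (__apple_build_version__ < threshold)
   would compile the fallback in for the given Apple Clang build number. */
static inline int sketch_guard_enables_fallback(unsigned long apple_build_version,
                                                unsigned long threshold) {
  return apple_build_version < threshold;
}

/* Assumed build number for Apple LLVM 10.0.1 (clang-1001.0.46.4). */
#define SKETCH_APPLE_CLANG_10_0_1 10010046UL

static inline void sketch_check(void) {
  /* "pre-10" guard: the fallback is skipped, so the call is left for the linker. */
  assert(!sketch_guard_enables_fallback(SKETCH_APPLE_CLANG_10_0_1, 10000000UL));
  /* Hypothetical "pre-11" guard: the fallback is compiled in. */
  assert(sketch_guard_enables_fallback(SKETCH_APPLE_CLANG_10_0_1, 11000000UL));
  /* The fallback itself only truncates to 16 bits. */
  assert(sketch_cvtu32_mask16(0x1FFFFu) == 0xFFFFu);
}
```

Calling sketch_check() from a small main() exercises all three cases. The point is only that the threshold in the guard, not the intrinsic's logic, decides whether a given Apple clang ends up with an unresolved symbol.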

That also explains why macOS 10.15 works: the clang version on macOS 10.15 is 11.
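If you are unsure which compiler a build actually uses (on macOS, gcc can be clang under another name, as in the first post), a small C helper can report what the preprocessor sees. This is only a sketch: describe_compiler is a hypothetical name, and it relies on the predefined macros __clang_major__, __clang_minor__, __apple_build_version__, __GNUC__ and __GNUC_MINOR__.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes a one-line description of the compiler, as seen by the preprocessor,
   into buf. Returns the value returned by snprintf. */
static inline int describe_compiler(char *buf, size_t size) {
#if defined(__clang__) && defined(__apple_build_version__)
  /* Apple's clang: the build number is what Apple-specific version checks compare. */
  return snprintf(buf, size, "Apple Clang %d.%d (build %lu)",
                  __clang_major__, __clang_minor__,
                  (unsigned long) __apple_build_version__);
#elif defined(__clang__)
  return snprintf(buf, size, "LLVM Clang %d.%d", __clang_major__, __clang_minor__);
#elif defined(__GNUC__)
  return snprintf(buf, size, "GCC %d.%d", __GNUC__, __GNUC_MINOR__);
#else
  return snprintf(buf, size, "unknown compiler");
#endif
}
```

Compile it with the same CC you pass to setup.py, call describe_compiler with a buffer of 64 bytes or so from main(), and print the result to see which branch applies.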


So can we compile clang 11 on macOS 10.13.6 and then use it for the build?
From my point of view, this would be a simple approach that avoids side effects. Any other suggestions?
I really want to stay on 10.13.6 with a portable GPU.

@zw19906 @Stevers sorry for the late response. I figured out how to build torch on macOS 10.13.6 + CUDA 10.1 (update 2) + cuDNN 7.6.5.
I summarized it in pytorch/pytorch#46803 and google/XNNPACK#1081 for your reference. Honestly, though, I think XNNPACK might have more potential bugs in its code generation templates. Anyway, this makes the torch build for AVX pass…