I am also getting this from a clean git clone. I am on macOS (Mojave 10.14.6) and I just updated Xcode to the latest version.
I did a fresh git clone of the pytorch repository and started following the README directions. This is what I got:
Undefined symbols for architecture x86_64:
  "__cvtu32_mask16", referenced from:
      _xnn_f32_clamp_ukernel__avx512f in libXNNPACK.a(avx512f.c.o)
      _xnn_f32_dwconv_ukernel_up16x25__avx512f in libXNNPACK.a(up16x25-avx512f.c.o)
      _xnn_f32_dwconv_ukernel_up16x4__avx512f in libXNNPACK.a(up16x4-avx512f.c.o)
      _xnn_f32_dwconv_ukernel_up16x9__avx512f in libXNNPACK.a(up16x9-avx512f.c.o)
Here’s the exact order of what I did following the fresh git clone.
I stumbled upon this documentation from Doxygen, which I’m aware PyTorch uses per CONTRIBUTING.md.
You can see the _cvtu32_mask16 function is specified there.
This is definitely beyond my realm of expertise ><. I commented on an issue on the GitHub project, but I wasn’t sure if that was the right place. It appears someone else is hitting the same issue, and it happened very recently.
The documentation where you found _cvtu32_mask16 indicates that it is an LLVM (clang) function, and the error message indicates that XNNPACK is the one trying to link against it.
When I check out pytorch tags/v1.5.0, the same line says Apple Clang pre-10, so the condition is (__apple_build_version__ < 10000000). As a result, XNNPACK does not define its own _cvtu32_mask16 implementation but defers to clang's. It then tries to link against it and fails, because the clang installed on macOS 10.13/10.14 does not have _cvtu32_mask16 implemented in its libraries.
That also explains why macOS 10.15 works because the clang version on macOS 10.15 is 11.
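To see which side of that version boundary a machine is on, here is a minimal Python sketch that parses the output of `clang --version`. The helper names (`parse_apple_clang_version`, `in_reported_gap`, `installed_clang_version_text`) are made up for illustration, and the "gap" test only encodes the reasoning in this thread (pre-10 gets XNNPACK's own definition, 11 ships the intrinsic, so 10.x is the problem case). It is not an official compatibility table.

```python
import re
import subprocess

# Apple has printed both "Apple LLVM version 10.0.1 (...)" and
# "Apple clang version 11.0.0 (...)" depending on the Xcode release.
_APPLE_CLANG_RE = re.compile(
    r"Apple (?:LLVM|clang) version (\d+)\.(\d+)(?:\.(\d+))?"
)


def parse_apple_clang_version(version_text):
    """Return (major, minor, patch) from `clang --version` output.

    Returns None when the text does not look like Apple Clang.
    """
    match = _APPLE_CLANG_RE.search(version_text)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def in_reported_gap(version):
    """Assumption taken from this thread only: Apple Clang 10.x is too new
    for XNNPACK's pre-10 fallback and too old to provide _cvtu32_mask16."""
    return version is not None and version[0] == 10


def installed_clang_version_text():
    """Run `clang --version`; return "" if clang cannot be started."""
    try:
        result = subprocess.run(
            ["clang", "--version"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        return ""
    return result.stdout


if __name__ == "__main__":
    version = parse_apple_clang_version(installed_clang_version_text())
    print("Apple Clang version:", version)
    print("In the gap described above:", in_reported_gap(version))
```

If the script reports a 10.x version, that matches the situation described above. Any other result only means this particular explanation does not apply.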
So can we compile clang 11 on macOS 10.13.6 and then use it for the build?
From my point of view, this could be a simple approach that avoids side effects. Any other suggestions?
I ask because I really want to stay on 10.13.6 with a portable GPU.
@zw19906 @Stevers sorry for the late response. I figured out how to build torch on macOS 10.13.6 + CUDA 10.1 (update 2) + cuDNN 7.6.5.
I summarized it in pytorch/pytorch#46803 and google/XNNPACK#1081 for your reference. Honestly, though, I think XNNPACK might have more potential bugs in its code generation templates. Anyway, this gets the torch build for AVX passing…