PyTorch on AArch64 -- NEON instructions

Hi everyone,

I have a question regarding native Arm NEON support for PyTorch on Armv8 architectures.

(Tool-Solutions/docker/pytorch-aarch64 at main · ARM-software/Tool-Solutions · GitHub)

I can see in the PyTorch Docker build that OpenBLAS is the default backend for AArch64, and according to the OpenBLAS project page, Arm NEON intrinsics do not seem to be enabled by default.

(GitHub - OpenMathLib/OpenBLAS: OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.)

My question is: are Arm NEON intrinsics enabled by default for floating-point inference in the latest PyTorch wheels? Is there an easy way to check whether my inference is currently using NEON instructions?

If NEON instructions are not enabled by default, are there any recent PyTorch wheels that do have NEON support? Or are there any build instructions I can use to build PyTorch with the Arm Compute Library (ACL) enabled?

Thanks everyone so much. I couldn’t really find any helpful documentation for this.

P.S. I have also tried to build PyTorch with ACL support myself, but I keep running into issues with linking ACL …

Hello,
The Docker builds for PyTorch offer two BLAS backends: oneDNN with ACL support, or OpenBLAS without ACL. You can find both at: https://hub.docker.com/r/armswdev/pytorch-arm-neoverse
You then need to docker pull the image for the backend you want.
P.S.: As far as I know, inference on Arm currently supports FP32, but int8 inference doesn’t seem to be fully supported…

Hi, our optimised backend is based on ACL. Could you build with the oneDNN + ACL backend? Alternatively, our upstream pip wheels should come with ACL. If you set ONEDNN_VERBOSE=profile_exec,profile_externals, you should be able to see oneDNN verbose logs showing calls into ACL kernels.
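
A minimal sketch of that check, in case it helps. It runs a small workload in a child process so that ONEDNN_VERBOSE is set before oneDNN initialises, then keeps the verbose lines that mention ACL. The helper names (run_with_onednn_verbose, acl_lines), the Linear workload, and the assumption that ACL-backed kernels show "acl" somewhere in their verbose line are mine and are not confirmed in this thread, so treat the filter as a rough heuristic.

```python
import os
import subprocess
import sys

# Hypothetical workload: any FP32 layer that oneDNN may handle would do here.
WORKLOAD = """
import torch
x = torch.randn(64, 256)
layer = torch.nn.Linear(256, 256)
with torch.no_grad():
    layer(x)
"""


def run_with_onednn_verbose(code, verbose="profile_exec,profile_externals"):
    """Run `code` in a fresh interpreter with ONEDNN_VERBOSE set; return its output."""
    env = dict(os.environ, ONEDNN_VERBOSE=verbose)
    proc = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
    )
    # Capture both streams so the verbose lines are kept wherever they are printed.
    return proc.stdout + proc.stderr


def acl_lines(log):
    """Keep oneDNN verbose lines that mention ACL (assumed naming, see above)."""
    return [
        line
        for line in log.splitlines()
        if line.startswith("onednn_verbose") and "acl" in line.lower()
    ]


if __name__ == "__main__":
    hits = acl_lines(run_with_onednn_verbose(WORKLOAD))
    print(f"{len(hits)} oneDNN verbose line(s) mention ACL")
    for line in hits[:5]:
        print(line)
```

If the count is zero, it may simply mean this particular workload was not dispatched to oneDNN, so it is worth trying the actual model before drawing conclusions.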

Regarding Arm NEON intrinsics, these should be enabled for eager mode: pytorch/aten/src/ATen/cpu/vec/vec256 at main · pytorch/pytorch · GitHub
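
For a quick look at what a given install was built with, something like the sketch below may be useful. It checks whether the kernel reports the asimd flag (the name under which Linux on AArch64 lists NEON in /proc/cpuinfo), whether the oneDNN (mkldnn) backend is available, and prints the build configuration. The cpu_features and report helpers are hypothetical names of my own. Note that this only shows what the CPU and the build support; it does not prove that a particular operator actually executes NEON code.

```python
import platform


def cpu_features(cpuinfo_text):
    """Return the flags on the first 'Features' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.lower().startswith("features") and ":" in line:
            return set(line.split(":", 1)[1].split())
    return set()


def report():
    print("machine:", platform.machine())
    try:
        with open("/proc/cpuinfo") as f:
            features = cpu_features(f.read())
        print("kernel reports asimd (NEON):", "asimd" in features)
    except OSError:
        print("/proc/cpuinfo is not readable on this system")

    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return
    print("torch version:", torch.__version__)
    print("oneDNN (mkldnn) available:", torch.backends.mkldnn.is_available())
    # Build-time settings, including the BLAS backend and compiler flags.
    print(torch.__config__.show())


if __name__ == "__main__":
    report()
```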

We are also going to enable SLEEF via torch.compile as well. How are you running your code: eager mode or torch.compile?

Regarding the issue you are facing when building with ACL: have you set ACL_ROOT_DIR?

If you are still facing this issue, can you raise an issue on our Tool-Solutions GitHub repo?
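
A small pre-build sanity check along these lines might help rule out the simplest causes. It only verifies that ACL_ROOT_DIR is set, points at a directory, and contains files named libarm_compute* somewhere underneath. The check_acl_root helper is a hypothetical name, and the assumption that the built libraries sit under ACL_ROOT_DIR with that file name is mine; adjust it to match your ACL build layout.

```python
import os
from pathlib import Path


def check_acl_root(env=None):
    """Return (ok, message) describing whether ACL_ROOT_DIR looks usable."""
    env = os.environ if env is None else env
    root = env.get("ACL_ROOT_DIR")
    if not root:
        return False, "ACL_ROOT_DIR is not set"
    path = Path(root)
    if not path.is_dir():
        return False, f"ACL_ROOT_DIR={root} is not a directory"
    # Assumed library naming: look for libarm_compute* anywhere below the root.
    libs = sorted(p.name for p in path.rglob("libarm_compute*"))
    if not libs:
        return False, f"no libarm_compute* files found under {root}"
    return True, f"found {', '.join(libs)} under {root}"


if __name__ == "__main__":
    ok, message = check_acl_root()
    print("OK:" if ok else "Problem:", message)
```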

Thanks
Crefeda