Installing PyTorch on ARM Cortex-A9

Hi All,

I am trying to install PyTorch on an ARM Cortex-A9 (32-bit, ARMv7-A architecture). However, during installation I run into an out-of-memory error stating that the virtual memory is exhausted.

The steps I followed for installation are:

  1. Update packages: sudo apt-get update && sudo apt-get upgrade
  2. Create SWAP partition:
    2.1) sudo dd if=/dev/zero of=/swapfile bs=1M count=4000 (the board has 512 MB of built-in RAM)
    2.2) sudo mkswap /swapfile
    2.3) sudo swapon /swapfile
    2.4) sudo nano /etc/fstab
    2.5) Added this line: /swapfile none swap sw 0 0 (a quick check of the swap setup is shown after this list)
  3. sudo apt-get install libopenblas-dev cython3 libatlas-dev m4 libblas-base-dev cmake
  4. pip3 install --user pyyaml numpy
  5. git clone --recursive https://github.com/pytorch/pytorch
  6. cd pytorch
  7. git checkout tags/v0.4.0 -b build
  8. git submodule update --init --recursive
  9. export NO_CUDA=1
  10. export NO_DISTRIBUTED=1
  11. python3 setup.py build
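
For completeness, this is roughly how the swap setup can be verified before starting the build (a quick sanity check; exact output differs per board, and older util-linux versions use swapon -s instead of swapon --show):

    sudo swapon --show              # the /swapfile entry should be listed with SIZE ~4G
    free -h                         # the "Swap:" row should reflect the new total
    cat /proc/sys/vm/swappiness     # current swappiness value (usually 60 by default)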

During the build, the process goes beyond 100% while running GCC compilations, and I get the following (highlights only):

gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/home/xilinx/pytorch -I/home/xilinx/pytorch/torch/csrc -I/home/xilinx/pytorch/third_party/pybind11/include -I/home/xilinx/pytorch/torch/lib/tmp_install/include -I/home/xilinx/pytorch/torch/lib/tmp_install/include/TH -I/home/xilinx/pytorch/torch/lib/tmp_install/include/THNN -I/home/xilinx/pytorch/torch/lib/tmp_install/include/ATen -I/root/.local/lib/python3.6/site-packages/numpy/core/include -I/usr/local/include/python3.6m -c torch/csrc/jit/generated/aten_dispatch.cpp -o build/temp.linux-armv7l-3.6/torch/csrc/jit/generated/aten_dispatch.o -D_THP_CORE -std=c++11 -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wno-write-strings -Wno-zero-length-array -fno-strict-aliasing -Wno-missing-braces -DWITH_NUMPY
torch/csrc/jit/interned_strings.cpp: In constructor ‘torch::jit::InternedStrings::InternedStrings()’:
torch/csrc/jit/interned_strings.cpp:78:3: note: variable tracking size limit exceeded with -fvar-tracking-assignments, retrying without
InternedStrings()
^

cc1plus: out of memory allocating 2274304 bytes after a total of 30859264 bytes

In file included from /home/xilinx/pytorch/torch/csrc/THP.h:34:0,
from torch/csrc/autograd/python_function.cpp:10:
torch/csrc/autograd/python_function.cpp: In function ‘PyObject* THPFunction_do_backward(THPFunction*, PyObject*)’:
torch/csrc/autograd/python_function.cpp:822:55: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
THPUtils_assert(PyTuple_GET_SIZE(raw_grad_output) == self->cdata.num_inputs(),
^
/home/xilinx/pytorch/torch/csrc/utils.h:21:45: note: in definition of macro ‘THP_EXPECT’
#define THP_EXPECT(x, y) (__builtin_expect((x), (y)))
^
/home/xilinx/pytorch/torch/csrc/utils.h:119:36: note: in expansion of macro ‘THPUtils_assertRet’
#define THPUtils_assert(cond, ...) THPUtils_assertRet(NULL, cond, __VA_ARGS__)
^
torch/csrc/autograd/python_function.cpp:822:5: note: in expansion of macro ‘THPUtils_assert’
THPUtils_assert(PyTuple_GET_SIZE(raw_grad_output) == self->cdata.num_inputs(),
^
At global scope:
cc1plus: warning: unrecognized command line option ‘-Wno-zero-length-array’

virtual memory exhausted: Cannot allocate memory

error: command ‘gcc’ failed with exit status 1

Note that I am using Python 3.6.5 and GCC 5.3.0.

Hi,

Would it be possible to try with 10 GB of swap space?
From what I remember, the compilation can be quite memory-hungry, something like ~5 GB. I am not sure whether that has improved since.
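
In case it helps, growing the swap file to around 10 GB can be done roughly like this (same tools as in your step 2; the size is only an example):

    sudo swapoff /swapfile
    sudo dd if=/dev/zero of=/swapfile bs=1M count=10000   # recreate the file at ~10 GB
    sudo chmod 600 /swapfile                              # recommended permissions for a swap file
    sudo mkswap /swapfile
    sudo swapon /swapfile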

I shall try and keep you posted.

After assigning 10 GB of swap, I get the following error:

error: command ‘arm-linux-gnueabihf-gcc’ failed with exit status 4

After changing the swappiness to 10 and trying to install the latest PyTorch, I get the following error:

/usr/include/c++/7/bits/stl_vector.h: In member function ‘void caffe2::ConvPoolOpBase<Context>::SetOutputSize(const caffe2::Tensor&, caffe2::Tensor*, int) [with Context = caffe2::CPUContext]’:
/usr/include/c++/7/bits/stl_vector.h:1369:17: note: parameter passing for argument of type ‘__gnu_cxx::__normal_iterator<long long int*, std::vector<long long int> >’ changed in GCC 7.1
{ _M_assign_aux(__first, __last, std::__iterator_category(__first)); }
~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[ 75%] Building CXX object caffe2/CMakeFiles/caffe2.dir/transforms/conv_to_nnpack_transform.cc.o
[ 75%] Building CXX object caffe2/CMakeFiles/caffe2.dir/transforms/pattern_net_transform.cc.o
[ 75%] Building CXX object caffe2/CMakeFiles/caffe2.dir/transforms/single_op_transform.cc.o
[ 75%] Linking CXX shared library …/lib/libcaffe2.so
[ 75%] Built target caffe2
Scanning dependencies of target pattern_net_transform_test
[ 76%] Building CXX object caffe2/CMakeFiles/pattern_net_transform_test.dir/transforms/pattern_net_transform_test.cc.o
Scanning dependencies of target caffe2_pybind11_state
[ 76%] Building CXX object caffe2/CMakeFiles/caffe2_pybind11_state.dir/python/pybind_state.cc.o
[ 76%] Linking CXX executable …/bin/pattern_net_transform_test
/home/xilinx/pytorch/build/lib/libcaffe2.so: undefined reference to `caffe2::detail::TypeMetaData const* caffe2::TypeMeta::_typeMetaDataInstance<long>()’
collect2: error: ld returned 1 exit status
caffe2/CMakeFiles/pattern_net_transform_test.dir/build.make:98: recipe for target ‘bin/pattern_net_transform_test’ failed
make[2]: *** [bin/pattern_net_transform_test] Error 1
CMakeFiles/Makefile2:1620: recipe for target ‘caffe2/CMakeFiles/pattern_net_transform_test.dir/all’ failed
make[1]: *** [caffe2/CMakeFiles/pattern_net_transform_test.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs…
[ 76%] Building CXX object caffe2/CMakeFiles/caffe2_pybind11_state.dir/python/pybind_state_dlpack.cc.o
[ 76%] Building CXX object caffe2/CMakeFiles/caffe2_pybind11_state.dir/python/pybind_state_nomni.cc.o
[ 76%] Building CXX object caffe2/CMakeFiles/caffe2_pybind11_state.dir/python/pybind_state_registry.cc.o
[ 76%] Linking CXX shared module python/caffe2_pybind11_state.cpython-36m-arm-linux-gnueabihf.so
[ 76%] Built target caffe2_pybind11_state
Makefile:140: recipe for target ‘all’ failed
make: *** [all] Error 2
setup.py::build_deps::run() Failed to run ‘bash …/tools/build_pytorch_libs.sh --use-nnpack caffe2 libshm’
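
For reference, the swappiness change mentioned above was done roughly like this (the sysctl interface is standard on Debian/Ubuntu-based images; adjust if your distro differs):

    sudo sysctl vm.swappiness=10                             # apply immediately
    echo 'vm.swappiness=10' | sudo tee -a /etc/sysctl.conf   # persist across reboots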

This is likely no longer relevant to the original poster, but for others: you can export MAX_JOBS=2, which reduces the number of parallel build workers. It is a lot simpler than adding swap memory.
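
Combined with the environment variables from the original post, that would look roughly like this:

    export NO_CUDA=1
    export NO_DISTRIBUTED=1
    export MAX_JOBS=2        # cap the number of parallel compiler processes
    python3 setup.py build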

Hi, did you succeed in installing PyTorch on an ARM Cortex-A9 (32-bit, ARMv7-A architecture)? I am running into the same problem and have the same need.

Hi, you can have a look at my GitHub page here, where I provide a working solution.