Installing PyTorch on ARM Cortex-A9

Hi All,

I am trying to install PyTorch on an ARM Cortex-A9 (32-bit, ARMv7-A architecture). However, during installation I run into an out-of-memory error stating that the virtual memory is exhausted.

The steps I followed for installation are:

  1. Update packages: sudo apt-get update && sudo apt-get upgrade
  2. Create SWAP partition:
    2.1) sudo dd if=/dev/zero of=/swapfile bs=1M count=4000 (the board has 512 MB of built-in RAM)
    2.2) sudo mkswap /swapfile
    2.3) sudo swapon /swapfile
    2.4) sudo nano /etc/fstab
    2.5) Added this line: /swapfile none swap sw 0 0 (a quick check of the swap setup is shown after this list)
  3. sudo apt-get install libopenblas-dev cython3 libatlas-dev m4 libblas-base-dev cmake
  4. pip3 install --user pyyaml numpy
  5. git clone --recursive https://github.com/pytorch/pytorch
  6. cd pytorch
  7. git checkout tags/v0.4.0 -b build
  8. git submodule update --init --recursive
  9. export NO_CUDA=1
  10. export NO_DISTRIBUTED=1
  11. python3 setup.py build
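
For completeness, this is roughly how the swap setup can be verified before starting the build (a quick sanity check; exact output differs per board, and older util-linux versions use swapon -s instead of swapon --show):

    sudo swapon --show              # the /swapfile entry should be listed with SIZE ~4G
    free -h                         # the "Swap:" row should reflect the new total
    cat /proc/sys/vm/swappiness     # current swappiness value (usually 60 by default)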

During the build, the process goes beyond 100% while running GCC compilations, and I get the following (highlights only):

gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/home/xilinx/pytorch -I/home/xilinx/pytorch/torch/csrc -I/home/xilinx/pytorch/third_party/pybind11/include -I/home/xilinx/pytorch/torch/lib/tmp_install/include -I/home/xilinx/pytorch/torch/lib/tmp_install/include/TH -I/home/xilinx/pytorch/torch/lib/tmp_install/include/THNN -I/home/xilinx/pytorch/torch/lib/tmp_install/include/ATen -I/root/.local/lib/python3.6/site-packages/numpy/core/include -I/usr/local/include/python3.6m -c torch/csrc/jit/generated/aten_dispatch.cpp -o build/temp.linux-armv7l-3.6/torch/csrc/jit/generated/aten_dispatch.o -D_THP_CORE -std=c++11 -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wno-write-strings -Wno-zero-length-array -fno-strict-aliasing -Wno-missing-braces -DWITH_NUMPY
torch/csrc/jit/interned_strings.cpp: In constructor ‘torch::jit::InternedStrings::InternedStrings()’:
torch/csrc/jit/interned_strings.cpp:78:3: note: variable tracking size limit exceeded with -fvar-tracking-assignments, retrying without
InternedStrings()
^

cc1plus: out of memory allocating 2274304 bytes after a total of 30859264 bytes

In file included from /home/xilinx/pytorch/torch/csrc/THP.h:34:0,
from torch/csrc/autograd/python_function.cpp:10:
torch/csrc/autograd/python_function.cpp: In function ‘PyObject* THPFunction_do_backward(THPFunction*, PyObject*)’:
torch/csrc/autograd/python_function.cpp:822:55: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
THPUtils_assert(PyTuple_GET_SIZE(raw_grad_output) == self->cdata.num_inputs(),
^
/home/xilinx/pytorch/torch/csrc/utils.h:21:45: note: in definition of macro ‘THP_EXPECT’
#define THP_EXPECT(x, y) (__builtin_expect((x), (y)))
^
/home/xilinx/pytorch/torch/csrc/utils.h:119:36: note: in expansion of macro ‘THPUtils_assertRet’
#define THPUtils_assert(cond, ...) THPUtils_assertRet(NULL, cond, __VA_ARGS__)
^
torch/csrc/autograd/python_function.cpp:822:5: note: in expansion of macro ‘THPUtils_assert’
THPUtils_assert(PyTuple_GET_SIZE(raw_grad_output) == self->cdata.num_inputs(),
^
At global scope:
cc1plus: warning: unrecognized command line option ‘-Wno-zero-length-array’

virtual memory exhausted: Cannot allocate memory

error: command ‘gcc’ failed with exit status 1

Note that I am using Python 3.6.5 and GCC 5.3.0.

Hi,

Would it be possible to try with 10 GB of swap space?
From what I remember, the compilation can be quite memory-hungry, something like ~5 GB. I am not sure whether that has improved since.
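
In case it helps, growing the swap file to around 10 GB can be done roughly like this (same tools as in your step 2; the size is only an example):

    sudo swapoff /swapfile
    sudo dd if=/dev/zero of=/swapfile bs=1M count=10000   # recreate the file at ~10 GB
    sudo chmod 600 /swapfile                              # recommended permissions for a swap file
    sudo mkswap /swapfile
    sudo swapon /swapfile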

I shall try and keep you posted.

After assigning 10 GB of swap, I get the following error:

error: command ‘arm-linux-gnueabihf-gcc’ failed with exit status 4

After changing the swappiness to 10 and trying to install the latest PyTorch, I get the following error:

/usr/include/c++/7/bits/stl_vector.h: In member function ‘void caffe2::ConvPoolOpBase<Context>::SetOutputSize(const caffe2::Tensor&, caffe2::Tensor*, int) [with Context = caffe2::CPUContext]’:
/usr/include/c++/7/bits/stl_vector.h:1369:17: note: parameter passing for argument of type ‘__gnu_cxx::__normal_iterator<long long int*, std::vector<long long int> >’ changed in GCC 7.1
{ _M_assign_aux(__first, __last, std::__iterator_category(__first)); }
~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[ 75%] Building CXX object caffe2/CMakeFiles/caffe2.dir/transforms/conv_to_nnpack_transform.cc.o
[ 75%] Building CXX object caffe2/CMakeFiles/caffe2.dir/transforms/pattern_net_transform.cc.o
[ 75%] Building CXX object caffe2/CMakeFiles/caffe2.dir/transforms/single_op_transform.cc.o
[ 75%] Linking CXX shared library …/lib/libcaffe2.so
[ 75%] Built target caffe2
Scanning dependencies of target pattern_net_transform_test
[ 76%] Building CXX object caffe2/CMakeFiles/pattern_net_transform_test.dir/transforms/pattern_net_transform_test.cc.o
Scanning dependencies of target caffe2_pybind11_state
[ 76%] Building CXX object caffe2/CMakeFiles/caffe2_pybind11_state.dir/python/pybind_state.cc.o
[ 76%] Linking CXX executable …/bin/pattern_net_transform_test
/home/xilinx/pytorch/build/lib/libcaffe2.so: undefined reference to `caffe2::detail::TypeMetaData const* caffe2::TypeMeta::_typeMetaDataInstance<long>()’
collect2: error: ld returned 1 exit status
caffe2/CMakeFiles/pattern_net_transform_test.dir/build.make:98: recipe for target ‘bin/pattern_net_transform_test’ failed
make[2]: *** [bin/pattern_net_transform_test] Error 1
CMakeFiles/Makefile2:1620: recipe for target ‘caffe2/CMakeFiles/pattern_net_transform_test.dir/all’ failed
make[1]: *** [caffe2/CMakeFiles/pattern_net_transform_test.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs…
[ 76%] Building CXX object caffe2/CMakeFiles/caffe2_pybind11_state.dir/python/pybind_state_dlpack.cc.o
[ 76%] Building CXX object caffe2/CMakeFiles/caffe2_pybind11_state.dir/python/pybind_state_nomni.cc.o
[ 76%] Building CXX object caffe2/CMakeFiles/caffe2_pybind11_state.dir/python/pybind_state_registry.cc.o
[ 76%] Linking CXX shared module python/caffe2_pybind11_state.cpython-36m-arm-linux-gnueabihf.so
[ 76%] Built target caffe2_pybind11_state
Makefile:140: recipe for target ‘all’ failed
make: *** [all] Error 2
setup.py::build_deps::run() Failed to run ‘bash …/tools/build_pytorch_libs.sh --use-nnpack caffe2 libshm’
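
For reference, the swappiness change mentioned above was done roughly like this (the sysctl interface is standard on Debian/Ubuntu-based images; adjust if your distro differs):

    sudo sysctl vm.swappiness=10                             # apply immediately
    echo 'vm.swappiness=10' | sudo tee -a /etc/sysctl.conf   # persist across reboots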

This is likely no longer relevant to the original poster, but for others: you can export MAX_JOBS=2, which reduces the number of parallel build workers. It is a lot simpler than adding swap memory.
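
Combined with the environment variables from the original post, that would look roughly like this:

    export NO_CUDA=1
    export NO_DISTRIBUTED=1
    export MAX_JOBS=2        # cap the number of parallel compiler processes
    python3 setup.py build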

Hi, did you succeed in installing PyTorch on an ARM Cortex-A9 (32-bit, ARMv7-A architecture)? I am running into the same problem and have the same need.

Hi, you can have a look at my GitHub page here, where I provide a working solution.