Memory error when installing pytorch

Hello,

I’ve had trouble installing pytorch locally on a shared computing cluster.

When I tried to install pytorch in a Python 3.6 virtualenv with pip3, I got the following error:

Exception:
Traceback (most recent call last):
  File "/apps/python3/python3-3.6.1-ic-2017-mkl2/lib/python3.6/site-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/apps/python3/python3-3.6.1-ic-2017-mkl2/lib/python3.6/site-packages/pip/commands/install.py", line 335, in run
    wb.build(autobuilding=True)
  File "/apps/python3/python3-3.6.1-ic-2017-mkl2/lib/python3.6/site-packages/pip/wheel.py", line 749, in build
    self.requirement_set.prepare_files(self.finder)
  File "/apps/python3/python3-3.6.1-ic-2017-mkl2/lib/python3.6/site-packages/pip/req/req_set.py", line 380, in prepare_files
    ignore_dependencies=self.ignore_dependencies))
  File "/apps/python3/python3-3.6.1-ic-2017-mkl2/lib/python3.6/site-packages/pip/req/req_set.py", line 620, in _prepare_file
    session=self.session, hashes=hashes)
  File "/apps/python3/python3-3.6.1-ic-2017-mkl2/lib/python3.6/site-packages/pip/download.py", line 821, in unpack_url
    hashes=hashes
  File "/apps/python3/python3-3.6.1-ic-2017-mkl2/lib/python3.6/site-packages/pip/download.py", line 663, in unpack_http_url
    unpack_file(from_path, location, content_type, link)
  File "/apps/python3/python3-3.6.1-ic-2017-mkl2/lib/python3.6/site-packages/pip/utils/__init__.py", line 599, in unpack_file
    flatten=not filename.endswith('.whl')
  File "/apps/python3/python3-3.6.1-ic-2017-mkl2/lib/python3.6/site-packages/pip/utils/__init__.py", line 488, in unzip_file
    data = zip.read(name)
  File "/apps/python3/python3-3.6.1-ic-2017-mkl2/lib/python3.6/zipfile.py", line 1309, in read
    return fp.read()
  File "/apps/python3/python3-3.6.1-ic-2017-mkl2/lib/python3.6/zipfile.py", line 833, in read
    buf += self._read1(self.MAX_N)
  File "/apps/python3/python3-3.6.1-ic-2017-mkl2/lib/python3.6/zipfile.py", line 923, in _read1
    data = self._decompressor.decompress(data, n)
MemoryError

I retried this using a compute node with 5.05 GB of memory and I got the same error.
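
For context on the traceback above: it ends in zip.read(name), which returns an entire decompressed archive member as a single bytes object, so peak memory grows with the largest file inside the wheel. The sketch below is only an illustration of the difference between that and chunked extraction; it is not how pip is implemented, and the helper name and the member name big_member.bin are hypothetical.

```python
import io
import shutil
import zipfile


def extract_member_streaming(archive, name, dest, chunk_size=64 * 1024):
    """Copy one archive member to dest in fixed-size chunks.

    Unlike archive.read(name), this never holds the whole
    decompressed member in memory at once.
    """
    with archive.open(name) as src:
        shutil.copyfileobj(src, dest, chunk_size)


# Build a small in-memory archive to demonstrate (hypothetical member name).
payload = b"x" * 1_000_000
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("big_member.bin", payload)

# Stream the member back out; a real destination would be a file on disk.
out = io.BytesIO()
with zipfile.ZipFile(buf) as zf:
    extract_member_streaming(zf, "big_member.bin", out)

print(len(out.getvalue()))
```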

I also tried to install pytorch with conda (conda install pytorch torchvision cuda80 -c soumith), but I got this error:

ERROR conda.core.link:_execute_actions(337): An error occurred while installing package 'soumith::pytorch-0.2.0-py36h53baedd_4cu80'.
MemoryError()
Attempting to roll back.
MemoryError()

How can I install pytorch?

Thank you for your time,

Gary


Try the install inside the Anaconda Docker container first.

https://hub.docker.com/r/continuumio/anaconda3/

Does it work well there?

Hi,
I have run into a similar problem too. Could you let me know how you resolved this issue, if and when you do? Thanks.

You probably don’t have enough RAM.

I was allocating memory incorrectly on the shared computation cluster, which is why I got the memory error.


I solved the issue by adding swap and passing --no-cache-dir to pip.
(My machine has 512 MB of RAM.)

First, create swap:

# create swap file of 512 MB
dd if=/dev/zero of=/swapfile bs=1024 count=524288
# modify permissions
chown root:root /swapfile
chmod 0600 /swapfile
# setup swap area
mkswap /swapfile
# turn swap on
swapon /swapfile

Second, install without the cache directory:

pip3 --no-cache-dir install http://download.pytorch.org/whl/cu75/torch-0.2.0.post3-cp35-cp35m-manylinux1_x86_64.whl
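
Before retrying the install, it may help to confirm that the swap file is actually active. The following is a rough sketch that parses text in the format of /proc/meminfo (Linux only); the function names are hypothetical and the sample values are made up for illustration, not taken from any machine in this thread.

```python
def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])
    return fields


def swap_is_enabled(fields):
    """True if the parsed meminfo reports a non-zero amount of swap."""
    return fields.get("SwapTotal", 0) > 0


# Illustrative sample only; on a real system read the file instead:
#     with open("/proc/meminfo") as f:
#         info = parse_meminfo(f.read())
sample = """\
MemTotal:         503444 kB
MemFree:           61232 kB
SwapTotal:        524284 kB
SwapFree:         524284 kB
HugePages_Total:       0
"""

info = parse_meminfo(sample)
print(swap_is_enabled(info), info["SwapTotal"])
```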



Yes, I got the same error while installing pytorch with pip on my VM, which has 1 GB of RAM allocated. Please suggest how I can overcome it.

In my case, just using the pip option --no-cache-dir was sufficient to make the install work.


About my config: I have 1 GB of RAM on my AWS EC2 Debian server. Here are the outputs of lsb_release and free, after the installation of torch:

admin@server:~$ lsb_release -cds
Debian GNU/Linux 10 (buster)
buster
admin@server:~$ free -h
              total        used        free      shared  buff/cache   available
Mem:          987Mi       439Mi       246Mi        10Mi       301Mi       396Mi
Swap:            0B          0B          0B
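
For anyone scripting the --no-cache-dir install mentioned above, here is a minimal sketch that builds the same kind of pip command from Python. The helper names are hypothetical; running pip through sys.executable simply ensures the package lands in the interpreter (or virtualenv) that runs the script.

```python
import subprocess
import sys


def pip_install_cmd(packages, no_cache=True):
    """Build a 'python -m pip install' command line as a list of arguments."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if no_cache:
        # Ask pip not to use its cache for this install.
        cmd.append("--no-cache-dir")
    cmd.extend(packages)
    return cmd


def pip_install(packages):
    """Run the install; raises CalledProcessError if pip exits non-zero."""
    subprocess.run(pip_install_cmd(packages), check=True)


# Only print the command here; call pip_install(["torch"]) to actually run it.
print(" ".join(pip_install_cmd(["torch"])[1:]))
```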

In my case, on a Fedora 23 VPS, I used the hack below:

mkdir /tmp2
export TMPDIR=/tmp2
python3.8 -m pip install --no-clean --no-cache-dir torch torchvision
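
A possible reason the TMPDIR hack helps, stated as an assumption rather than a diagnosis: on systems where /tmp is a small or RAM-backed filesystem, unpacking a large wheel there can exhaust it, and pointing TMPDIR at a directory on disk avoids that. The sketch below shows that Python's tempfile module honours TMPDIR and reports the free space at the chosen location; the function name is hypothetical, and a throwaway directory stands in for /tmp2.

```python
import os
import shutil
import tempfile


def temp_dir_free_bytes():
    """Return (temp directory, free bytes on the filesystem holding it)."""
    tmp = tempfile.gettempdir()
    return tmp, shutil.disk_usage(tmp).free


# Create a stand-in for /tmp2 (a throwaway directory for the demo).
new_tmp = tempfile.mkdtemp()

# Point TMPDIR at it, then clear tempfile's cached value so the
# environment variable is consulted again.
os.environ["TMPDIR"] = new_tmp
tempfile.tempdir = None

chosen, free_bytes = temp_dir_free_bytes()
print(chosen == new_tmp, free_bytes >= 0)
```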