Memory Issue when loading data with multiprocessing

I am working with a large dataset of videos. I modified a torch.utils.data.Dataset so that it returns a video path, which is then sent to my video decoder based on OpenCV.
In my code there is:
batch_proc = multiprocessing.Process(target=video_batcher, args=(queue, fv, batchsize))
which starts a new process for each video clip and calls join() after all batches of a video have been processed. queue is a multiprocessing.Queue that stores one batch of converted frames at a time, say 128 images per batch.
fv is an object that maintains a buffer and reads from the video continuously using threading.
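Roughly, the structure looks like the sketch below. This is only a simplified stand-in: FrameReader plays the role of my threaded `fv` reader, and the frames are dummy data.

```python
import multiprocessing

import numpy as np


class FrameReader(object):
    """Placeholder for my threaded, buffered video reader ('fv')."""
    def __init__(self, path, num_frames=512):
        self.path = path
        self.remaining = num_frames

    def read_frames(self, n):
        if self.remaining <= 0:
            return None
        n = min(n, self.remaining)
        self.remaining -= n
        return np.zeros((n, 240, 320, 3), dtype=np.uint8)  # dummy decoded frames


def video_batcher(queue, fv, batchsize):
    """Producer: read frames from fv and push them to the queue one batch at a time."""
    while True:
        batch = fv.read_frames(batchsize)
        if batch is None:
            queue.put(None)        # sentinel: this clip is done
            break
        queue.put(batch)


if __name__ == "__main__":
    batchsize = 128
    queue = multiprocessing.Queue()
    for video_path in ["clip_0.mp4", "clip_1.mp4"]:   # one producer process per clip
        fv = FrameReader(video_path)
        batch_proc = multiprocessing.Process(target=video_batcher,
                                             args=(queue, fv, batchsize))
        batch_proc.start()
        while True:
            batch = queue.get()
            if batch is None:
                break
            # ... feed `batch` through the pretrained CNN here ...
        batch_proc.join()
```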
I intended to concatenate every batch of feature maps (the output of a pretrained CNN) and save them as HDF5, but the following error is reported on every loop:

Traceback (most recent call last):
  File "/opt/conda/lib/python2.7/multiprocessing/queues.py", line 268, in _feed
    send(obj)
  File "/opt/conda/lib/python2.7/site-packages/torch/multiprocessing/queue.py", line 17, in send
    ForkingPickler(buf, pickle.HIGHEST_PROTOCOL).dump(obj)
  File "/opt/conda/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/opt/conda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/conda/lib/python2.7/multiprocessing/forking.py", line 67, in dispatcher
    self.save_reduce(obj=obj, *rv)
  File "/opt/conda/lib/python2.7/pickle.py", line 401, in save_reduce
    save(args)
  File "/opt/conda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/conda/lib/python2.7/pickle.py", line 554, in save_tuple
    save(element)
  File "/opt/conda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/conda/lib/python2.7/multiprocessing/forking.py", line 66, in dispatcher
    rv = reduce(obj)
  File "/opt/conda/lib/python2.7/site-packages/torch/multiprocessing/reductions.py", line 113, in reduce_storage
    fd, size = storage.share_fd()
RuntimeError: $ Torch: unable to mmap memory: you tried to mmap 0GB. at /b/wheel/pytorch-src/torch/lib/TH/THAllocator.c:317

Is that allocator not working well with `multiprocessing.Queue`?

Is this in a Docker setting? If so, you need to set the flag `--ipc=host` when starting the Docker session.
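For example (the image name here is just a placeholder): `docker run --ipc=host -it your_pytorch_image /bin/bash`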

It is. Thank you. By the way, if I want to run an LSTM on quite a long sequence of data with relatively high dimension, can PyTorch handle GPU memory well? To my understanding, autograd records every operation through the forward and backward passes, and PyTorch processes all the data as one batch, so I am afraid of running into an out-of-memory error.
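One common way to keep memory bounded on very long sequences is to split the sequence into chunks and detach the hidden state between chunks (truncated backpropagation through time). A minimal sketch with made-up sizes, a placeholder loss, and a GPU assumed:

```python
import torch
import torch.nn as nn

# Made-up sizes, purely for illustration
seq_len, batch, feat, hidden, chunk = 10000, 4, 256, 512, 100

lstm = nn.LSTM(feat, hidden).cuda()
optim = torch.optim.SGD(lstm.parameters(), lr=0.01)
data = torch.randn(seq_len, batch, feat)   # stand-in for the real features

state = None
for start in range(0, seq_len, chunk):
    x = data[start:start + chunk].cuda()
    out, state = lstm(x, state)
    loss = out.pow(2).mean()               # placeholder loss
    optim.zero_grad()
    loss.backward()
    optim.step()
    # Detach the hidden state so autograd frees the graph of previous chunks
    state = tuple(s.detach() for s in state)
```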

Hi, I met exactly the same problem. Could you please tell me how you solved this problem specifically? How can I set the flag `--ipc=host` when starting the Docker session?

For the issue about the Docker session you can check https://docs.docker.com/engine/reference/run/#ipc-settings-ipc. As for my code, I keep the data as numpy.ndarray batches during loading and preprocessing; they are converted to CUDA tensors only right before being sent to the network.
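In other words, something like the sketch below: the worker and queue names are made up, the frames are dummy data, and a GPU is assumed.

```python
import multiprocessing as mp

import numpy as np
import torch


def video_worker(queue, num_batches, batch_size):
    """Producer: put plain numpy batches on the queue, not torch tensors."""
    for _ in range(num_batches):
        frames = np.random.rand(batch_size, 3, 224, 224).astype(np.float32)
        queue.put(frames)
    queue.put(None)   # sentinel: no more data


if __name__ == "__main__":
    queue = mp.Queue(maxsize=4)
    proc = mp.Process(target=video_worker, args=(queue, 10, 128))
    proc.start()

    while True:
        frames = queue.get()
        if frames is None:
            break
        # Convert to a CUDA tensor only here, right before the forward pass
        batch = torch.from_numpy(frames).cuda()
        # features = cnn(batch) ...

    proc.join()
```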

I did not use Docker, but I have already solved the bug. Thanks for your reply!

But how did you solve it exactly?
