Memory Issue when loading data with multiprocessing

I am working with a large dataset of videos. I modified a torch.utils.data.Dataset so that it returns a video path, which is then sent to my video decoder based on OpenCV.
In my code there is:
batch_proc = multiprocessing.Process(target=video_batcher, args=(queue, fv, batchsize))
which starts a new process for each video clip and calls join() after all batches of a video have been processed. queue is a multiprocessing.Queue that stores one batch of converted frames at a time, say 128 images per batch.
fv is an object that maintains a buffer and reads from the video continuously using threading.
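Roughly, the structure looks like the sketch below. This is only a simplified stand-in: FrameReader plays the role of my threaded `fv` reader, and the frames are dummy data.

```python
import multiprocessing

import numpy as np


class FrameReader(object):
    """Placeholder for my threaded, buffered video reader ('fv')."""
    def __init__(self, path, num_frames=512):
        self.path = path
        self.remaining = num_frames

    def read_frames(self, n):
        if self.remaining <= 0:
            return None
        n = min(n, self.remaining)
        self.remaining -= n
        return np.zeros((n, 240, 320, 3), dtype=np.uint8)  # dummy decoded frames


def video_batcher(queue, fv, batchsize):
    """Producer: read frames from fv and push them to the queue one batch at a time."""
    while True:
        batch = fv.read_frames(batchsize)
        if batch is None:
            queue.put(None)        # sentinel: this clip is done
            break
        queue.put(batch)


if __name__ == "__main__":
    batchsize = 128
    queue = multiprocessing.Queue()
    for video_path in ["clip_0.mp4", "clip_1.mp4"]:   # one producer process per clip
        fv = FrameReader(video_path)
        batch_proc = multiprocessing.Process(target=video_batcher,
                                             args=(queue, fv, batchsize))
        batch_proc.start()
        while True:
            batch = queue.get()
            if batch is None:
                break
            # ... feed `batch` through the pretrained CNN here ...
        batch_proc.join()
```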
I intended to concatenate every batch of feature maps (the output of a pretrained CNN) and save them as HDF5, but the following error is reported on every loop:

Traceback (most recent call last):
  File "/opt/conda/lib/python2.7/multiprocessing/queues.py", line 268, in _feed
    send(obj)
  File "/opt/conda/lib/python2.7/site-packages/torch/multiprocessing/queue.py", line 17, in send
    ForkingPickler(buf, pickle.HIGHEST_PROTOCOL).dump(obj)
  File "/opt/conda/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/opt/conda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/conda/lib/python2.7/multiprocessing/forking.py", line 67, in dispatcher
    self.save_reduce(obj=obj, *rv)
  File "/opt/conda/lib/python2.7/pickle.py", line 401, in save_reduce
    save(args)
  File "/opt/conda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/conda/lib/python2.7/pickle.py", line 554, in save_tuple
    save(element)
  File "/opt/conda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/conda/lib/python2.7/multiprocessing/forking.py", line 66, in dispatcher
    rv = reduce(obj)
  File "/opt/conda/lib/python2.7/site-packages/torch/multiprocessing/reductions.py", line 113, in reduce_storage
    fd, size = storage.share_fd()
RuntimeError: $ Torch: unable to mmap memory: you tried to mmap 0GB. at /b/wheel/pytorch-src/torch/lib/TH/THAllocator.c:317

Is that allocator not working well with `multiprocessing.Queue`?

Is this in a Docker setting? If so, you need to set the flag `--ipc=host` when starting the Docker session.
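For example (the image name here is just a placeholder): `docker run --ipc=host -it your_pytorch_image /bin/bash`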

It is. Thank you. By the way, if I want to run an LSTM on quite a long sequence of data with relatively high dimension, can PyTorch handle GPU memory well? To my understanding, autograd records every operation through the forward and backward passes, and PyTorch processes all the data as one batch, so I am afraid of running into an out-of-memory error.
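One common way to keep memory bounded on very long sequences is to split the sequence into chunks and detach the hidden state between chunks (truncated backpropagation through time). A minimal sketch with made-up sizes, a placeholder loss, and a GPU assumed:

```python
import torch
import torch.nn as nn

# Made-up sizes, purely for illustration
seq_len, batch, feat, hidden, chunk = 10000, 4, 256, 512, 100

lstm = nn.LSTM(feat, hidden).cuda()
optim = torch.optim.SGD(lstm.parameters(), lr=0.01)
data = torch.randn(seq_len, batch, feat)   # stand-in for the real features

state = None
for start in range(0, seq_len, chunk):
    x = data[start:start + chunk].cuda()
    out, state = lstm(x, state)
    loss = out.pow(2).mean()               # placeholder loss
    optim.zero_grad()
    loss.backward()
    optim.step()
    # Detach the hidden state so autograd frees the graph of previous chunks
    state = tuple(s.detach() for s in state)
```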

Hi, I met exactly the same problem. Could you please tell me how you solved this problem specifically? How can I set the flag `--ipc=host` when starting the Docker session?

For the issue about the Docker session you can check https://docs.docker.com/engine/reference/run/#ipc-settings-ipc. As for my code, I keep the data as numpy.ndarray batches during loading and preprocessing; they are converted to CUDA tensors only right before being sent to the network.
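In other words, something like the sketch below: the worker and queue names are made up, the frames are dummy data, and a GPU is assumed.

```python
import multiprocessing as mp

import numpy as np
import torch


def video_worker(queue, num_batches, batch_size):
    """Producer: put plain numpy batches on the queue, not torch tensors."""
    for _ in range(num_batches):
        frames = np.random.rand(batch_size, 3, 224, 224).astype(np.float32)
        queue.put(frames)
    queue.put(None)   # sentinel: no more data


if __name__ == "__main__":
    queue = mp.Queue(maxsize=4)
    proc = mp.Process(target=video_worker, args=(queue, 10, 128))
    proc.start()

    while True:
        frames = queue.get()
        if frames is None:
            break
        # Convert to a CUDA tensor only here, right before the forward pass
        batch = torch.from_numpy(frames).cuda()
        # features = cnn(batch) ...

    proc.join()
```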

I did not use Docker, but I have already solved the bug. Thanks for your reply!

But how did you solve it exactly?
