Mmap memory error when using multiple CPUs on Azure

Hi all,

I am using multiple CPUs to train my model on Azure with MongoDB. It seems I need to open a connection to the data in each of the threads, and then I get this error:

Traceback (most recent call last):
  File "", line 225, in <module>
  File "/home/textiq/anaconda/lib/python3.6/site-packages/torch/nn/modules/", line 468, in share_memory
    return self._apply(lambda t: t.share_memory_())
  File "/home/textiq/anaconda/lib/python3.6/site-packages/torch/nn/modules/", line 118, in _apply
  File "/home/textiq/anaconda/lib/python3.6/site-packages/torch/nn/modules/", line 124, in _apply = fn(
  File "/home/textiq/anaconda/lib/python3.6/site-packages/torch/nn/modules/", line 468, in <lambda>
    return self._apply(lambda t: t.share_memory_())
  File "/home/textiq/anaconda/lib/python3.6/site-packages/torch/", line 86, in share_memory_
  File "/home/textiq/anaconda/lib/python3.6/site-packages/torch/", line 101, in share_memory_
RuntimeError: $ Torch: unable to mmap memory: you tried to mmap 0GB. at /py/conda-bld/pytorch_1493681908901/work/torch/lib/TH/THAllocator.c:317

Could someone tell me how to solve this problem?

Thanks in advance.


I am using Ubuntu 16.04, PyTorch, Linux 4.4.0-81-generic, and Python 3.6.

This is weird. I wonder if Azure is somehow limiting the shared memory available to your process. Are you running Docker inside Azure?
Also, what’s the output of:

ipcs -lm

Thanks for your reply. I just figured out what happened. I wasn’t using Docker inside Azure. The problem is that I mistakenly initialized an nn.Embedding in the model with a size of 0 (for example, nn.Embedding(0, 300)), which generates this error when calling model.share_memory(). I’ve fixed it now.
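For anyone hitting the same message: the "0GB" in the error is literal. share_memory() moves each tensor's storage into an mmap-backed shared allocation, and an nn.Embedding(0, 300) has a zero-element weight, so Torch ends up asking mmap for zero bytes. A minimal sketch with only the standard library (no PyTorch needed) shows the same underlying failure, since mapping an empty file is rejected the same way:

```python
import mmap
import tempfile

# Shared-memory tensors are backed by memory-mapped storage; a tensor
# with zero elements means a zero-length map, which mmap refuses.
# This stdlib analogue reproduces that: mapping an empty file fails.
with tempfile.TemporaryFile() as f:
    try:
        mmap.mmap(f.fileno(), 0)  # length 0 requested on an empty file
    except ValueError as e:
        print(e)  # mmap rejects the empty mapping
```

So the fix is simply to make sure the vocabulary size passed as the first argument of nn.Embedding is greater than zero before calling model.share_memory().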


Thanks for figuring this out. We’ll improve the error message in this situation; you can track it.

Really appreciate your prompt reply!