I ran into a problem when using the torch multiprocessing library; the full error message is below. I have already changed the num_workers parameter of my DataLoader to 0, but that does not solve it, and I have monitored the host machine's memory, which is sufficient. How can I solve this?
I also tried switching the sharing strategy, with no effect:

import torch
import torch.multiprocessing

torch.multiprocessing.set_sharing_strategy('file_system')
Traceback (most recent call last):
  File "run.py", line 53, in <module>
    main()
  File "run.py", line 49, in main
    slam.run()
  File "/NICE_SLAM.py", line 304, in run
    p.start()
  File "anaconda3/envs/nice-slam/lib/python3.7/multiprocessing/process.py", line 112, in start
    self._popen = self._Popen(self)
  File "anaconda3/envs/nice-slam/lib/python3.7/multiprocessing/context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "anaconda3/envs/nice-slam/lib/python3.7/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "anaconda3/envs/nice-slam/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "anaconda3/envs/nice-slam/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "anaconda3/envs/nice-slam/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "anaconda3/envs/nice-slam/lib/python3.7/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "anaconda3/envs/nice-slam/lib/python3.7/site-packages/torch/multiprocessing/reductions.py", line 338, in reduce_storage
    metadata = storage._share_filename_()
RuntimeError: unable to mmap 128 bytes from file </torch_31439_2312018605_63322>: Cannot allocate memory (12)
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
Process finished with exit code 1
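For completeness, here is a small stdlib-only diagnostic sketch (Linux assumed) that prints the two limits most relevant to this kind of mmap failure, since host RAM can look sufficient while these are exhausted: the open-file limit (used by torch's file_descriptor sharing strategy) and free space in the temp directory (where the file_system strategy puts its mmapped files):

```python
import resource
import shutil
import tempfile

# Open-file limit: with the file_descriptor strategy, each shared
# tensor consumes a file descriptor, so a low soft limit can cause
# allocation failures even with plenty of free RAM.
soft_fds, hard_fds = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file limit (soft/hard): {soft_fds}/{hard_fds}")

# Temp-directory space: with the file_system strategy, shared tensors
# are backed by mmapped files, so free space in the temp filesystem
# (often a size-limited tmpfs) matters, not just overall host memory.
tmp_dir = tempfile.gettempdir()
usage = shutil.disk_usage(tmp_dir)
print(f"{tmp_dir}: {usage.free / 2**30:.1f} GiB free "
      f"of {usage.total / 2**30:.1f} GiB")
```

The exact values that count as "enough" depend on how many tensors the processes share, so treat this only as a starting point for checking whether one of these limits, rather than total RAM, is what is being hit.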