Hi, we have enabled the multi-worker DataLoader to load 10K+ training data files, and the speed is quite good with multiple workers. However, when we try to use the workers not only to read the data line by line but also to parse each line into a JSON dict, we run into a problem.
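For context, our loader setup is roughly like the sketch below (`FileListDataset`, `load_file`, and `file_paths` are placeholder names for illustration, not our exact code):

```python
from torch.utils.data import DataLoader, Dataset

class FileListDataset(Dataset):
    """Hypothetical dataset: each item corresponds to one of the 10K+ files."""
    def __init__(self, file_paths):
        self.file_paths = file_paths

    def __len__(self):
        return len(self.file_paths)

    def __getitem__(self, idx):
        # Each worker process opens and reads one file (see the snippets below).
        return load_file(self.file_paths[idx])

loader = DataLoader(FileListDataset(file_paths), batch_size=1, num_workers=8)
```

With multiple workers enabled, training dies with: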
ERROR: Unexpected segmentation fault encountered in worker.
Traceback (most recent call last):
  File "/home/miniconda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 480, in _try_get_batch
    data = self.data_queue.get(timeout=timeout)
  File "/home/miniconda/lib/python3.6/multiprocessing/queues.py", line 104, in get
    if not self._poll(timeout):
  File "/home/miniconda/lib/python3.6/multiprocessing/connection.py", line 257, in poll
    return self._poll(timeout)
  File "/home/miniconda/lib/python3.6/multiprocessing/connection.py", line 414, in _poll
    r = wait([self], timeout)
  File "/home/miniconda/lib/python3.6/multiprocessing/connection.py", line 911, in wait
    ready = selector.select(timeout)
  File "/home/miniconda/lib/python3.6/selectors.py", line 376, in select
    fd_event_list = self._poll.poll(timeout)
  File "/home/miniconda/lib/python3.6/site-packages/torch/utils/data/_utils/signal_handling.py", line 65, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 95106) is killed by signal: Segmentation fault.
I searched around and it seems to point to a shared-memory problem. However, I think our system has sufficient shared memory.
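For reference, here is how I checked, plus a workaround I found in similar reports (a minimal sketch; switching the sharing strategy is an assumption on my part, not something the error message suggested):

```python
import torch.multiprocessing as mp

# On Linux, worker results travel through POSIX shared memory backed by
# /dev/shm; running `df -h /dev/shm` while the loader is active shows
# how much of it is actually in use.

# A workaround mentioned in similar reports: share tensors through files
# on disk instead of file descriptors, which sidesteps fd-limit issues.
mp.set_sharing_strategy('file_system')
print(mp.get_sharing_strategy())  # -> 'file_system'
```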
Another interesting point: when I just read the data line by line, I do not have this issue:
with open(current_file, mode='rb') as f:
    text = f.read().decode('utf-8')
all_data.extend(text.split('\n'))
But if I add JSON-parsing logic after reading the lines, it reports the error above:
import json  # needed for json.loads below

with open(current_file, mode='rb') as f:
    text = f.read().decode('utf-8')
all_data.extend(text.split('\n'))
json_data = []
for line in all_data:
    if not line:
        continue  # skip the empty string split('\n') leaves after a trailing newline
    try:
        json_data.append(json.loads(line))
    except json.JSONDecodeError:
        continue  # skip malformed lines rather than silently stopping at the first one
return json_data
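As a possible workaround I am considering parsing lazily inside an IterableDataset, so each worker yields one parsed dict at a time instead of materializing a huge list (a sketch assuming a PyTorch version with IterableDataset, i.e. >= 1.2; `LazyJsonDataset` and `file_paths` are hypothetical names):

```python
import json
from torch.utils.data import IterableDataset, get_worker_info

class LazyJsonDataset(IterableDataset):
    """Streams parsed JSON dicts one at a time instead of building a big list."""
    def __init__(self, file_paths):
        self.file_paths = file_paths

    def __iter__(self):
        info = get_worker_info()
        # Shard the file list across workers so each file is read exactly once.
        paths = (self.file_paths if info is None
                 else self.file_paths[info.id::info.num_workers])
        for path in paths:
            with open(path, mode='rb') as f:
                for raw in f:
                    line = raw.decode('utf-8').strip()
                    if line:
                        yield json.loads(line)
```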
So I wonder: is this really a shared-memory issue? The error message is very vague. Also, what is shared memory actually used for in the multi-worker DataLoader case? Shouldn't the workers run independently and simply feed data to the GPU at the end? I don't see much need to share large chunks of memory beyond some worker coordination.
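For what it's worth, my plan to narrow this down is to rerun with num_workers=0, so everything executes in the main process and a crash in the parsing code surfaces as an ordinary traceback rather than a killed worker (a sketch; `LazyJsonDataset` is the hypothetical class above):

```python
from torch.utils.data import DataLoader

# num_workers=0 disables worker subprocesses and the shared-memory queues,
# so any fault in the reading/parsing code shows up directly.
debug_loader = DataLoader(LazyJsonDataset(file_paths), batch_size=None, num_workers=0)
for sample in debug_loader:
    pass  # just exercise the full read-and-parse pipeline
```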