Data loader struct.pack issue (overflow)

I have also created an issue in the PyTorch repo: https://github.com/pytorch/pytorch/issues/43467

I am using the PyTorch DataLoader and run into a strange error (traceback below). We generate training data every hour, and in this experiment I only read the training-data CSV files and parse them line by line; no real training is going on. We define the Dataset's __getitem__ to return a single file out of a large set of files (>5000 files), and each DataLoader worker is supposed to read that file and parse the CSV.
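Roughly, the Dataset looks something like the sketch below (simplified, with made-up names; the real file listing and parsing are more involved):

    import csv
    from torch.utils.data import Dataset, DataLoader

    class HourlyCsvDataset(Dataset):
        """Each index maps to one hourly CSV file; the worker reads and parses it."""

        def __init__(self, file_paths):
            self.file_paths = file_paths  # >5000 hourly CSV files

        def __len__(self):
            return len(self.file_paths)

        def __getitem__(self, index):
            rows = []
            with open(self.file_paths[index], newline="") as f:
                for row in csv.reader(f):
                    rows.append(row)  # parse line by line, no real training
            # The whole parsed file is pickled and sent back to the main process.
            return rows

    # loader = DataLoader(HourlyCsvDataset(file_paths), num_workers=8)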

I checked, and it should not be an __getitem__ index out of range. Is it because one of the training files we load is too large? The error is not always reproducible, so I am not sure whether it is related to particular training data. Is it possible that one training file was too large and the multi-worker DataLoader basically overflowed?
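To test the "one file is too large" theory, something like this (hypothetical: file_paths is the same list the Dataset uses) could flag files whose raw size is already close to the 2 GiB limit that shows up in the traceback:

    import os

    # The pipe's length header is a signed 32-bit int, so anything whose
    # pickled size exceeds 2**31 - 1 bytes (~2 GiB) triggers the error;
    # the pickled rows are usually larger than the raw CSV file itself.
    LIMIT = 2**31 - 1
    for path in file_paths:
        size = os.path.getsize(path)
        if size > LIMIT // 2:
            print(f"{path}: {size / 2**30:.2f} GiB")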

Traceback (most recent call last):
  File "/home/miniconda/lib/python3.6/multiprocessing/queues.py", line 240, in _feed
    send_bytes(obj)
  File "/home/miniconda/lib/python3.6/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/miniconda/lib/python3.6/multiprocessing/connection.py", line 393, in _send_bytes
    header = struct.pack("!i", n)
struct.error: 'i' format requires -2147483648 <= number <= 2147483647

It looks like multiprocessing cannot send more than 2 GB of data in a single message. Is there any way to enlarge the size that multiprocessing can handle?
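For reference, the "!i" format in the traceback is a signed 32-bit big-endian int, so the length header of a single message can only describe up to 2**31 - 1 bytes (~2 GiB); a minimal demonstration of the same error:

    import struct

    # The multiprocessing connection packs the payload length with '!i'
    # (signed 32-bit big-endian), so 2**31 - 1 is the largest valid value.
    print(struct.pack("!i", 2**31 - 1))  # fine: b'\x7f\xff\xff\xff'
    try:
        struct.pack("!i", 2**31)         # one byte past the limit
    except struct.error as e:
        print(e)  # 'i' format requires -2147483648 <= number <= 2147483647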