Cannot iterate through dataloader

I am using a pretrained model for text recognition. I developed the entire project on an Ubuntu-based system. Last week, I switched to a Mac M1. I am using the same code and the same versions of libraries/packages, but I cannot get it to run.

The problem lies with the dataloader, specifically when I iterate through the dataloader object. The same code works on an Ubuntu-based system with x86 architecture, but not here.

Below is the error that I am getting. Any help would be appreciated.

ERROR:root:2021-07-11_12-18-14 exception caused - cannot pickle 'module' object
Traceback (most recent call last):
  File "/Users/dhruv/PycharmProjects/ocr/src/recognition/recognition.py", line 154, in recog_only
    recog_model.recognition_output(input_dir=os.path.join(input_path, folder))
  File "/Users/dhruv/PycharmProjects/ocr/src/recognition/recognition.py", line 70, in recognition_output
    for image_tensors, image_path_list in self.demo_loader:
  File "/Users/dhruv/miniforge3/envs/ocr-py38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 355, in __iter__
    return self._get_iterator()
  File "/Users/dhruv/miniforge3/envs/ocr-py38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 301, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/Users/dhruv/miniforge3/envs/ocr-py38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 914, in __init__
    w.start()
  File "/Users/dhruv/miniforge3/envs/ocr-py38/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/dhruv/miniforge3/envs/ocr-py38/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Users/dhruv/miniforge3/envs/ocr-py38/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/Users/dhruv/miniforge3/envs/ocr-py38/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/dhruv/miniforge3/envs/ocr-py38/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/dhruv/miniforge3/envs/ocr-py38/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Users/dhruv/miniforge3/envs/ocr-py38/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle 'module' object

Thanks,
Dhruv

Could you add the if-clause protection via:


def main():
    for i, data in enumerate(dataloader):
        # do something here
        pass

if __name__ == '__main__':
    main()

and rerun your code?

Thank you for your reply!

Currently I have this function inside a class Recognition:

    def recognition_output(self, input_dir):
        try:
            self.prepare_data(input_dir=input_dir)
        except Exception as e:
            logging.info(f"{datetime.now().strftime('%Y-%m-%d_%H-%M-%S')} failed to prepare data")
            logging.exception(f"{datetime.now().strftime('%Y-%m-%d_%H-%M-%S')} exception caused - {e}")

        with torch.no_grad():
            for image_tensors, image_path_list in self.demo_loader:
                # do something

First I create an object of the Recognition class and then call this function from another file.

Can you please tell me where (in which file) I am supposed to put the if-clause protection?

Thank you!

Put it in the main script file you are executing from the terminal via e.g. python script.py args (put it into script.py).
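As a rough sketch of that layout (not your actual code): the class below is a hypothetical stub standing in for your Recognition class, and a plain list stands in for the DataLoader, so the example runs on its own. In your project the class would be imported from its own file instead. The point is only that the guard goes in the file you launch, around the call that ends up iterating the loader.

```python
# script.py (the file you run as: python script.py)


class Recognition:
    """Hypothetical stub; the real class would be imported from its own file."""

    def __init__(self):
        # Stand-in for the DataLoader that prepare_data() would build.
        self.demo_loader = [(["tensor0"], ["img0.png"]), (["tensor1"], ["img1.png"])]

    def recognition_output(self, input_dir):
        seen = []
        for image_tensors, image_path_list in self.demo_loader:
            seen.extend(image_path_list)
        return seen


def main():
    recog_model = Recognition()
    return recog_model.recognition_output(input_dir="input")


if __name__ == "__main__":
    # Only the process started from the terminal runs this branch;
    # worker processes that re-import the script skip it.
    print(main())
```

The class itself does not need a guard. Only the top-level code that kicks everything off does.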

Unfortunately, I still get the same error:

cannot pickle 'module' object

Unfortunately, I don’t have a Mac M1 and since the code seems to run fine on a Linux system I won’t be able to reproduce it, so we would have to wait for M1 users.
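In the meantime, one thing that might help narrow it down: your traceback goes through popen_spawn_posix.py, so the workers are being started with the "spawn" method, which pickles the objects handed to each worker (including the dataset). Linux uses "fork" by default, which does not pickle them, and that would explain why the same code runs there. The error suggests that something reachable from your dataset holds a reference to a Python module. Below is a minimal, standard-library-only sketch for finding such an attribute. The helper find_unpicklable_attrs and the FakeDataset class are made up for illustration, so you would call the helper on your own dataset object instead.

```python
import pickle
import types


def find_unpicklable_attrs(obj):
    """Return the names of instance attributes that fail to pickle."""
    bad = []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception:
            bad.append(name)
    return bad


class FakeDataset:
    """Hypothetical dataset that (by mistake) stores a module on self."""

    def __init__(self):
        self.paths = ["a.png", "b.png"]
        self.backend = types  # a module object, which cannot be pickled


print(find_unpicklable_attrs(FakeDataset()))
```

If this points at an attribute, replacing the stored module with an import inside the method that uses it might be enough. Setting num_workers=0 in the DataLoader should also avoid the pickling step entirely, at the cost of loading data in the main process.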

So I guess this is going to haunt me for quite a while. I will keep trying and if I make any progress, I will update this thread.

Thanks for your time and help!

One quick question:

Should I raise an issue on the GitHub repo?

Yes, please create an issue on GitHub so that we can track and fix it.

Okay, I will.

Thank you!