DataLoader Multiprocessing error: can't pickle odict_keys objects when num_workers > 0

I eventually solved my problem, and I'll leave the solution here so that hopefully someone else will be spared the pain.

It had nothing to do with the Python version or interactive shells. I tried different environments and none of them made it work. The error came down to how Python pickles dictionary keys, which matters for DataLoader workers on Windows.

My PyTorch dataset object (a torch.utils.data.Dataset subclass) wrapped a classification dataset parsed from an .xml file. The pipeline was: .xml -> data_dict = dict{‘classnames’: dict{‘example 1’: img_path, …} …} -> PyTorch dataset with a list of all elements. In the dataset class I gathered all class names so I could access them as an attribute:

self.classes = data_dict.keys()

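This caused the error because data_dict.keys() does not return a list. It returns a view object (odict_keys in my case) that stays tied to the dict built in the class where I use ElementTree to extract the data from the .xml, and pickle cannot serialize such a view. With num_workers > 0 on Windows, the dataset is pickled to be sent to the worker processes, so this one attribute broke the DataLoader. I resolved the issue by copying the keys into a separate list: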

self.classes = list(data_dict.keys())

Note that dicts and odicts are not troublesome in general: assigning a dict as an attribute did not cause the error.
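
For anyone who wants to check this without PyTorch, here is a minimal sketch using only the standard library. The contents of data_dict and the helper is_picklable are made up for illustration and are not from my actual code. It only tests whether pickle accepts each object, which is what matters when the dataset is handed to worker processes.

```python
import pickle
from collections import OrderedDict

# Hypothetical stand-in for the dict parsed from the .xml file
data_dict = OrderedDict(
    cats={"example 1": "cats/1.png"},
    dogs={"example 1": "dogs/1.png"},
)


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


keys_view = data_dict.keys()        # odict_keys view, still tied to data_dict
keys_list = list(data_dict.keys())  # independent plain list of the key names

# Print the type name and whether pickle accepts it
print(type(keys_view).__name__, is_picklable(keys_view))
print(type(keys_list).__name__, is_picklable(keys_list))
print(type(data_dict).__name__, is_picklable(data_dict))
```

The keys view should be the only one of the three that pickle rejects, which matches what I saw: the dict itself is fine as an attribute, the view is not.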
