I tried running it as a regular .py script from PowerShell, which resulted in the same error, just with a longer traceback:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
I eventually solved my problem, and I'll leave the solution here so that hopefully someone else will be spared the pain.
It had nothing to do with the Python version or interactive shells; I tried different environments and none of them fixed it. The error was related to the combination of pickling, dictionaries, Windows, and Python.
My PyTorch dataset object (torch.utils.data.Dataset) wrapped a classification dataset obtained from an XML file. The pipeline was: .xml → data_dict = {'classname': {'example 1': img_path, …}, …} → PyTorch dataset with a list of all elements. In the dataset class I gathered all class names, to be accessed as an attribute, with:
self.classes = data_dict.keys()
which caused the error, because data_dict.keys() was only a shallow reference pointing at the keys held by the class where I use ElementTree to extract the dict from the .xml! I could resolve the issue by assigning separate memory:
self.classes = list(data_dict.keys())
Note that dicts and OrderedDicts are not troublesome in general; assigning a dict as an attribute did not cause the error.
Can you explain more clearly? I am new to PyTorch and I've run into the same problem as you. I have no idea where to add the line you mentioned, self.classes = list(data_dict.keys()). Thank you!
My problem was that my class attribute was merely a reference to a reference that pointed into a file, i.e. the values for my classes were read directly from disk. That caused the pickling to fail. In Python, assigning one variable from another often just creates a new reference to the same memory; no new memory is allocated. In my case I had to wrap the iterable dict.keys() in a list. While both yield exactly the same values, the list() constructor allocates new memory in RAM.
You might have a similar problem in your pipeline if you read from a CSV, XML, JSON, or whatever. Make sure that at some point your code makes a deep copy of whatever values you read in, so that the variables to be pickled point into RAM rather than into the file on disk.
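To make it concrete, here is a minimal sketch of where the line goes. The fix belongs in your Dataset's __init__, at the point where the keys are stored; everything apart from the list() call is just scaffolding for illustration:

import torch.utils.data

class ClassificationDataset(torch.utils.data.Dataset):
    def __init__(self, data_dict):
        # data_dict is the {'classname': {'example 1': img_path, ...}, ...}
        # structure described above, however you happen to build it.

        # The fix goes here: data_dict.keys() is a view that cannot be
        # pickled; list() copies the key strings into a plain picklable list.
        self.classes = list(data_dict.keys())

        # Flatten the nested dict into a plain list of (class, img_path).
        self.samples = [(cls, path)
                        for cls, examples in data_dict.items()
                        for path in examples.values()]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]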
Excuse me, I've run into the same problem as you, but since I'm new to Python and PyTorch I really can't figure out how to solve it. Can you help me?
self.train_data points into the .h5 file. Try wrapping it with a list constructor, or whatever data type is appropriate, to create a deep copy instead of a shallow reference:
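For example (just a sketch: h5_path and the dataset name 'train' are placeholders for whatever your file actually contains):

import h5py

# Inside your Dataset's __init__:
with h5py.File(h5_path, 'r') as f:
    # f['train'] is only a handle into the file on disk. Slicing with [:]
    # copies the whole dataset into an in-memory numpy array, which the
    # DataLoader worker processes can pickle.
    self.train_data = f['train'][:]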
In my case, the error said 'h5py object cannot be pickled'. I think the reason is that Python copies all of a class's member variables when it uses multiprocessing, and an h5py object cannot be copied to a different process.
So if you set num_workers=0, you will avoid this error. My solution was to avoid using an h5py object as a member variable.
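One common way to do that (a sketch only; the file path and the dataset name 'images' are placeholders) is to store just the path and open the file lazily in __getitem__, so that no h5py handle exists at the moment the dataset is pickled to the workers:

import h5py
import torch.utils.data

class H5Dataset(torch.utils.data.Dataset):
    def __init__(self, h5_path):
        self.h5_path = h5_path   # a plain string, pickles fine
        self.file = None         # the unpicklable handle is created lazily
        # Read the length once and close again; keep no h5py object around.
        with h5py.File(h5_path, 'r') as f:
            self.length = len(f['images'])

    def __getitem__(self, idx):
        # Opened on first access, i.e. inside each worker process,
        # after pickling has already happened.
        if self.file is None:
            self.file = h5py.File(self.h5_path, 'r')
        return self.file['images'][idx]

    def __len__(self):
        return self.length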
By the way, the real reason self.classes = list(data_dict.keys()) is needed may be that in Python 3 the type of data_dict.keys() is <class 'dict_keys'>, which cannot be pickled either.
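That is easy to verify:

import pickle

d = {'cat': 0, 'dog': 1}
pickle.dumps(list(d.keys()))   # fine: a plain list of strings
pickle.dumps(d.keys())         # raises TypeError: cannot pickle dict_keys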