A C++ preprocessor class is implemented with pybind11. The C++ class is imported in a customized dataset (`torch.utils.data.Dataset`), and one of its functions is called in `__getitem__`. When I try to use DDP for multi-process training, the error "cannot pickle preprocessor object" occurs. It works correctly if I use one GPU without DDP (num_workers > 2, num_batch > 2).
It seems DDP needs to pickle everything contained in the dataset class in order to share the dataloader between processes (link).
I also checked the official pybind11 documentation on pickling support: link
But there are no clues about how to pickle a general C++ class that holds pointers (e.g., used to fill numpy arrays).
Am I using DDP wrongly, or is there a solution that can fix this?
Hard to say without a reproducible example, but I would guess that you have custom pybind11 objects inside your Dataset. Try defining `__getstate__` and `__setstate__` functions for them. Also, in most cases (unless it is shared memory), passing memory pointers between processes will not work; you need to serialize the data itself.
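A minimal sketch of what this could look like on the C++ side, using the `py::pickle` helper from the pybind11 pickling docs. The `Preprocessor` class here is a hypothetical stand-in for yours (its members `window_size` and `buffer` are illustrative); the point is that the state tuple must contain copies of the underlying data, never raw pointers:

```cpp
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>   // enables std::vector <-> Python list conversion
#include <vector>
#include <stdexcept>

namespace py = pybind11;

// Hypothetical stand-in for your preprocessor: owns data behind a pointer
// internally, but exposes it as a copyable std::vector for serialization.
struct Preprocessor {
    int window_size;
    std::vector<float> buffer;  // serialize the data, not a pointer to it
    explicit Preprocessor(int w) : window_size(w) {}
};

PYBIND11_MODULE(preproc, m) {
    py::class_<Preprocessor>(m, "Preprocessor")
        .def(py::init<int>())
        .def(py::pickle(
            // __getstate__: convert the C++ state into picklable Python objects
            [](const Preprocessor &p) {
                return py::make_tuple(p.window_size, p.buffer);
            },
            // __setstate__: rebuild the object from that tuple in the new process
            [](py::tuple t) {
                if (t.size() != 2)
                    throw std::runtime_error("Invalid state!");
                Preprocessor p(t[0].cast<int>());
                p.buffer = t[1].cast<std::vector<float>>();
                return p;
            }));
}
```

Once the class is picklable this way, each DataLoader worker and DDP process gets its own reconstructed copy of the object, which is exactly why the pointers themselves must not appear in the state: an address from one process is meaningless in another.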