Pybind dataloader pickle

Hello everyone,
As suggested by @rvarm1, I have changed the topic category (distributed → dataloader) and am re-posting it here. The following is the original problem:

A C++ preprocessor class is implemented with pybind11. The C++ class is imported in a customized dataset (torch.utils.data.Dataset) and one of its functions is called from __getitem__. Once I tried to use DDP for multi-GPU training, the error "cannot pickle preprocessor object" occurred. It works correctly if I use one GPU without DDP (num_workers > 2, num_batch > 2).

It seems that DDP/DataLoader needs to pickle everything contained in the dataset class in order to share the dataloader between different processes (link).

I also checked the official pybind11 documentation about pickling support, but there are no clues about how to pickle a general C++ class that holds pointers (which fill NumPy arrays).

Am I using DDP incorrectly, or is there a way to solve this?

Thanks,
Lin