How to share a Dataset among different processes in PyTorch?

My dataset for training a PyTorch model needs a lot of I/O to generate, so I want only one process to generate it. This is my code:

import torch.multiprocessing as mp

# world_size is the number of GPUs
mp.spawn(train_worker, args=(world_size, config_data), nprocs=world_size, join=True)

def train_worker(rank, world_size, cfgs):
    trainer = MyDistributedTask(rank, world_size, cfgs)
    trainer.run()

class MyDistributedTask:
    def make_dataset(self, is_tra_val_te: str):
        # Only rank 0 generates the dataset; the other ranks return None
        if self.rank == 0:
            data_cfgs = self.cfgs["data_cfgs"]
            dataset = MyDataset(data_cfgs, is_tra_val_te)
            return dataset

However, the MyDataset instance created in process 0 is not sent to the other processes, so how can I share MyDataset among the processes? I don't want to use mp.Lock, because it slows the program down and causes deadlocks with polars.
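
One idea I've been considering is to let rank 0 build the dataset and then broadcast it to the other ranks with torch.distributed.broadcast_object_list. This is only a rough sketch of what I mean, assuming the default process group has already been initialized with dist.init_process_group inside the task and that MyDataset is picklable:

import torch.distributed as dist

class MyDistributedTask:
    def make_dataset(self, is_tra_val_te: str):
        # Container for the object to broadcast; only rank 0 fills it in
        obj_list = [None]
        if self.rank == 0:
            data_cfgs = self.cfgs["data_cfgs"]
            obj_list[0] = MyDataset(data_cfgs, is_tra_val_te)
        # Pickles the dataset on rank 0 and unpickles it on every other rank
        dist.broadcast_object_list(obj_list, src=0)
        return obj_list[0]

Would this be a reasonable approach, or is there a better way? If the dataset is too large to pickle and broadcast, I could also imagine rank 0 writing it to disk and the other ranks loading it after a dist.barrier().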

Thanks for your reply.