MultiProcessingReadingService problem with tar files

Hi,

I have a webdataset that is composed of tar files, and I created a datapipe pipeline to use it. Below is the code I use; decode is a function that does some preprocessing on the data (images and their captions, in my case).

from braceexpand import braceexpand
from torchdata.datapipes.iter import FileOpener
from torchdata.dataloader2 import DataLoader2
# depending on the torch version, SHARDING_PRIORITIES may live in a different module
from torch.utils.data.datapipes.iter.sharding import SHARDING_PRIORITIES

# open the tar shards and expose them as a webdataset-style stream of samples
dp = FileOpener(list(braceexpand(data_path + "/{00000..05000}.tar")), mode="b")
dp = dp.load_from_tar(length=datasetLength).webdataset()
dp = dp.shuffle().sharding_filter()
dp.apply_sharding(num_processes, process_index, sharding_group=SHARDING_PRIORITIES.DISTRIBUTED)
dp = dp.map(decode)
dp = dp.batch(batch_size=batch_size, drop_last=True)

trainLoader = DataLoader2(dp)

This works fine, but when I try to use the MultiProcessingReadingService to speed up data loading, I run into a pickle error.
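For reference, this is roughly how I attach the reading service (the worker count here is just a placeholder):

from torchdata.dataloader2 import DataLoader2, MultiProcessingReadingService

# same datapipe graph as above, but now executed in worker processes;
# the pickle error below appears when the graph is serialized for the workers
rs = MultiProcessingReadingService(num_workers=4)
trainLoader = DataLoader2(dp, reading_service=rs)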

Process ForkProcess-1:
Traceback (most recent call last):
  File "/azureml-envs/azureml_99407ef20b35f1d5e9103d8f1bfac59a/lib/python3.8/site-packages/torch/utils/data/graph.py", line 67, in _list_connected_datapipes
    p.dump(scan_obj)
TypeError: cannot pickle 'ExFileObject' object

I have dill installed, but it doesn't change anything.
Does anyone know what I am doing wrong?

Thanks in advance,
Corentin