Multiprocessing/DataLoader performance: file descriptor vs file system


I've been trying to speed up training by optimizing the DataLoader configuration while running inside a Docker container.
At first, permission issues and the limit on open file descriptors restricted me to the 'file_system' sharing strategy; without it, my code crashed on the first batch whenever I used more than one worker.
Later, I was able to raise the container's shm_size as well as the open file descriptor limit, so the default 'file_descriptor' strategy became feasible.
I was expecting a speedup, since it is the default and recommended option, but instead I'm seeing a 2x slowdown.
Any clues/ideas?
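As a side note, the open file descriptor limit inside the container can be inspected and raised from Python itself. A minimal sketch using only the standard library; the 65536 target is an arbitrary assumption, not a recommended value:

```python
import resource

# Inspect the per-process limit on open file descriptors (RLIMIT_NOFILE).
# The 'file_descriptor' sharing strategy keeps one fd open per tensor
# shared with a worker process, so a low soft limit can crash
# multi-worker loading on the first batch.
orig_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {orig_soft}, hard limit: {hard}")

# Raise the soft limit toward 65536 (an assumed target), never above
# the hard limit; only a privileged process can raise the hard limit.
if orig_soft != resource.RLIM_INFINITY and orig_soft < 65536:
    target = 65536 if hard == resource.RLIM_INFINITY else min(hard, 65536)
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
```

This mirrors what `ulimit -n` does in the shell, but scoped to the current process and its children (including DataLoader workers forked from it).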

How many workers are you using in your current DataLoader?
I assume you are comparing it to a DataLoader with num_workers=0 and see the 2x slowdown?

your assumption is wrong :wink:

I've been trying many configurations to see how they affect the run time:
I compared different batch sizes against different numbers of workers,
and then compared the same configuration (batch size 256 / 10 workers) with 'file descriptor' vs 'file system'.
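For reference, that kind of comparison can be timed with a small sketch like the one below. The tiny random dataset and the two-worker setting are placeholders, not the actual setup, and it assumes a Linux host where workers are forked:

```python
import time

try:
    import torch
    import torch.multiprocessing as mp
    from torch.utils.data import DataLoader, TensorDataset
except ImportError:  # torch may not be installed in this environment
    torch = None

timings = {}
if torch is not None:
    # Tiny in-memory dataset standing in for the real one.
    data = TensorDataset(torch.randn(2048, 8), torch.randint(0, 2, (2048,)))
    for strategy in ("file_descriptor", "file_system"):
        # Switch how tensors are shared between workers and the main process.
        mp.set_sharing_strategy(strategy)
        loader = DataLoader(data, batch_size=256, num_workers=2)
        start = time.perf_counter()
        for _ in loader:  # one full pass; workers spawn here
            pass
        timings[strategy] = time.perf_counter() - start
    print(timings)
```

Note that worker startup cost is included in each pass, so for a meaningful comparison the real run should iterate many epochs (or use persistent_workers=True) rather than a single pass like this sketch.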

So you see a 2x slowdown using a DataLoader with batch_size=256 and num_workers=10 in comparison to “file system”?
Could you post some more information on your use case and system, and where exactly you are seeing the slowdown (and how large it is)?