Custom dataloader using Python Multiprocessing and num_workers in torch dataloader

Hi, I have a custom dataloader in which I explicitly use Python’s multiprocessing to parallelize data preprocessing. It uses 8 worker processes (num_threads) in its multiprocessing pool.
I wanted to know how that will affect my call to the torch DataLoader.
Do I need to set the num_workers argument to 8, or can I leave it at 0?
My custom loader looks like this:

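from multiprocessing import Pool

def processParams(params):
    # <some operations on params>
    return params

def processParamsParallel(params, pool):
    # Apply processParams to every element of params using the pool's worker processes.
    results = pool.map(processParams, params)
    return results

class DataLoader(object):
    def __init__(self, params, maxId):
        self.params = params
        self.id = 0
        self.maxId = maxId
        # The 8 preprocessing workers are owned by this loader, not by torch.
        self.pool = Pool(processes=8)

    def __iter__(self):
        self.id = 0
        results = processParamsParallel(self.params, self.pool)
        yield results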

It’s a very rough example of what I am trying to do.
Now, in the torch call:

import torch

dl = DataLoader(params, 50)
dl_torch = torch.utils.data.DataLoader(dl, num_workers=<what_here?>, prefetch_factor=<what_here?>)

On a similar note, how will prefetch_factor be affected, given that the workers are created by the custom DataLoader itself rather than by the torch call? For reference, the documentation describes it as:


prefetch_factor - Number of samples loaded in advance by each worker. 2 means there will be a total of 2 * num_workers samples prefetched across all workers. (default: 2)
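
To make the arithmetic in that description concrete, here is a minimal sketch. Note that total_prefetched is a hypothetical helper I made up for illustration, not a torch function; it only restates the formula quoted above and says nothing about what my own Pool does internally.

```python
def total_prefetched(num_workers: int, prefetch_factor: int = 2) -> int:
    """Hypothetical helper (not part of torch).

    Restates the quoted formula: each worker loads `prefetch_factor`
    items in advance, so the total is prefetch_factor * num_workers.
    """
    if num_workers < 0 or prefetch_factor < 0:
        raise ValueError("num_workers and prefetch_factor must be non-negative")
    return prefetch_factor * num_workers


if __name__ == "__main__":
    # Compare a few worker counts using the default prefetch_factor of 2.
    for workers in (0, 4, 8):
        print(workers, total_prefetched(workers))
```

By this formula, num_workers = 8 with the default factor gives 16 items prefetched by torch, while num_workers = 0 gives none, which is why I am unsure what prefetch_factor means when only my own pool does the parallel work.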

Thank you in Advance!