DataLoader num_workers vs torch.set_num_threads

Is there a difference between the parallelization performed by these two options? I'm assuming num_workers is solely concerned with parallelizing the data loading. Does torch.set_num_threads control parallelism for training in general? Trying to understand the difference between these options. Thanks!


Your assumption is correct:

  • num_workers sets the number of processes (EDITED thanks @SimonW) used to load and preprocess data in the DataLoader
  • set_num_threads sets the number of threads that can be used to perform CPU operations such as conv or mm (typically via OpenMP or MKL)

To be clear, the DataLoader's num_workers specifies the number of multiprocessing workers, and has nothing to do with threads.
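
A minimal sketch showing the two knobs side by side (the dataset, batch size, and worker/thread counts here are made-up values for illustration):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Cap the threads used for intra-op CPU parallelism (ops like conv or mm,
# backed by OpenMP/MKL). This does NOT affect the DataLoader.
torch.set_num_threads(2)

# A toy dataset: 100 samples of 3 features each, with binary labels.
dataset = TensorDataset(torch.randn(100, 3), torch.randint(0, 2, (100,)))

# num_workers=2 spawns 2 worker *processes* that load and collate batches
# in the background; num_workers=0 would load in the main process instead.
loader = DataLoader(dataset, batch_size=10, num_workers=2)

for x, y in loader:
    pass  # a training step would go here

print(torch.get_num_threads())  # → 2
```

Note that on platforms using the spawn start method (e.g. Windows), DataLoader code with num_workers > 0 should be guarded by `if __name__ == "__main__":`.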