I’m training on data where the collate() function needs relatively heavy computation (some sequence packing). Right now I’m running with around 40 DataLoader workers, but the main thread still stalls waiting for data.
I noticed that each worker calls torch.set_num_threads(1). Is there a reason for that, apart from limiting the number of threads? Is it OK to raise the number of threads each worker can use, e.g. by calling torch.set_num_threads in the worker_init_fn?
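
To make the question concrete, here’s a minimal sketch of what I have in mind (ToyDataset, packing_collate, and the thread/worker counts are just placeholders standing in for my real setup):

```python
import torch
from torch.utils.data import DataLoader, Dataset

# Placeholder values for illustration; the idea would be to tune them so that
# num_workers * threads_per_worker stays at or below the machine's core count.
THREADS_PER_WORKER = 4
NUM_WORKERS = 10


class ToyDataset(Dataset):
    """Stand-in for the real dataset: variable-length sequences."""

    def __len__(self):
        return 1024

    def __getitem__(self, idx):
        length = torch.randint(16, 256, (1,)).item()
        return torch.randn(length)


def packing_collate(batch):
    # Stand-in for the real, computation-heavy sequence packing.
    return torch.cat(batch)


def worker_init_fn(worker_id):
    # DataLoader workers default to torch.set_num_threads(1); this would
    # override it so the heavy collate_fn can use intra-op parallelism.
    torch.set_num_threads(THREADS_PER_WORKER)


if __name__ == "__main__":
    loader = DataLoader(
        ToyDataset(),
        batch_size=32,
        num_workers=NUM_WORKERS,
        collate_fn=packing_collate,
        worker_init_fn=worker_init_fn,
    )

    for batch in loader:
        pass  # training step would go here
```

My understanding is that collate_fn runs inside the worker process when num_workers > 0, so the per-worker thread count should be what matters for the packing cost. Is that right, and is there any downside to this besides the risk of oversubscribing cores?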