Understanding workers

Hello, I have a question about DataLoader. With batch size = 1 and workers = 0, when I iterate over the dataset and print inside it, I get one print output.
But with batch size = 1 and workers = 1 I get two print outputs. Is that normal? I expected to get one here as well. And with, for example, batch size = 2 and workers = 2, I expected 4 but I get 8… Could you please explain this to me?

Maybe the image will help you to understand my question.
Thank you.

Hi,

When workers = 0, the loading happens synchronously in the main process, so each batch is loaded only when you ask for it.

When workers > 0, since loading is done in other processes, we have more freedom. In particular, we preload future batches in advance so that when you ask for them, they are (hopefully) already ready. And we load the next batch while you’re processing the one you just got.
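To see this for yourself, here is a minimal sketch of a reproduction. `LoggingDataset` and `fetch_first_batch` are made-up names for this example, and the short pause is only there to give the workers time to finish what they were already asked to load. The exact number and order of lines printed by the workers can vary with timing, so treat the output as indicative.

```python
import time

from torch.utils.data import DataLoader, Dataset


class LoggingDataset(Dataset):
    """Toy dataset (hypothetical) that prints every index it is asked to load."""

    def __init__(self, size=16):
        self.size = size

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        print(f"loading sample {index}")
        return index


def fetch_first_batch(batch_size, num_workers, pause=1.0):
    """Request only the first batch and return it."""
    loader = DataLoader(LoggingDataset(), batch_size=batch_size, num_workers=num_workers)
    iterator = iter(loader)
    first = next(iterator)
    if num_workers > 0:
        # Keep the iterator alive briefly so the workers can finish the
        # batches they were already asked to prefetch.
        time.sleep(pause)
    return first


if __name__ == "__main__":
    for batch_size, num_workers in [(1, 0), (1, 1), (2, 2)]:
        print(f"--- batch_size={batch_size}, num_workers={num_workers} ---")
        print("first batch:", fetch_first_batch(batch_size, num_workers))
```

With workers = 0 you should see exactly as many "loading sample" lines as there are samples in one batch; with workers > 0 you will typically see more, because future batches are being loaded in the background.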

Thanks albanD,

Is it defined how many batches are loaded in advance? Just curious.

IIRC 2 * workers, see code here.
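As a quick sanity check of that figure against the numbers in the question, here is a small arithmetic sketch. `samples_loaded_up_front` is a made-up helper, not part of PyTorch; it simply assumes that 2 batches per worker are requested up front (in recent PyTorch versions this is the default of the DataLoader's `prefetch_factor` argument).

```python
def samples_loaded_up_front(batch_size, num_workers, prefetch_factor=2):
    """Estimate how many samples are requested when iteration starts.

    Assumption: with workers, prefetch_factor batches per worker are
    requested up front; without workers, only the batch asked for is loaded.
    """
    if num_workers == 0:
        return batch_size
    return prefetch_factor * num_workers * batch_size


for batch_size, num_workers in [(1, 0), (1, 1), (2, 2)]:
    count = samples_loaded_up_front(batch_size, num_workers)
    print(f"batch_size={batch_size}, num_workers={num_workers}: {count} samples")
```

This gives 1, 2 and 8 samples for the three settings, which matches the print counts reported in the question.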
