Every "n" batches, there is a very slow batch

With num_workers = 8, here are the times it takes to load each batch (I am reading image files from disk with an ImageFolder dataset):

25.684549808502197
7.860300064086914
0.0003273487091064453
2.17626690864563
0.00019550323486328125
0.00014281272888183594
1.0835604667663574
7.271766662597656e-05
10.315490961074829
1.0640451908111572
2.1994986534118652
0.00010514259338378906
2.168898582458496
0.00011897087097167969
2.2624127864837646
0.0001761913299560547
19.841031312942505
2.242060899734497
0.00010609626770019531
4.744529724121094e-05
2.2600150108337402
0.00014209747314453125
2.28255558013916
8.440017700195312e-05
21.11872434616089
1.1199963092803955
0.00010371208190917969
0.00013947486877441406
2.335517644882202
0.00017642974853515625
2.341693639755249
0.0001652240753173828
19.975724935531616
2.2457833290100098
0.00010275840759277344
0.00010085105895996094
2.3120276927948
0.0001964569091796875
2.362861394882202
0.00013828277587890625
21.159653186798096
1.117790937423706
0.000133514404296875

Every num_workers batches, there is one very, very slow batch. Why is this, and how can I fix it?

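The workers are probably not filling their queue with new batches fast enough, so your training loop has to wait for them. Each worker loads a complete batch on its own, so with num_workers = 8 roughly eight batches become ready at about the same time; the loop consumes them quickly and then stalls until the next round is finished.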
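To make the periodic stall concrete, here is a minimal, idealized model of it in plain Python (no PyTorch needed). It is only a sketch under stated assumptions: every worker needs the same fixed time per batch, workers start their next batch immediately, and batches are handed out in order. The function name `simulate_wait_times` and the values for `load_time` and `step_time` are hypothetical, chosen only for illustration and not measured from your setup.

```python
def simulate_wait_times(num_batches, num_workers, load_time, step_time):
    """Return how long the training loop waits for each batch, in seconds.

    Idealized model (assumptions, not the real DataLoader internals):
    - worker (i % num_workers) produces batch i,
    - every batch takes exactly `load_time` seconds to load,
    - the training loop spends `step_time` seconds on each batch.
    """
    waits = []
    clock = 0.0  # simulated wall-clock time of the training loop
    for i in range(num_batches):
        # The k-th batch of any worker is finished at (k + 1) * load_time.
        ready_at = (i // num_workers + 1) * load_time
        wait = max(0.0, ready_at - clock)
        waits.append(wait)
        clock += wait + step_time
    return waits


# Hypothetical numbers: 8 workers, 20 s to load a batch, 0.1 s per training step.
waits = simulate_wait_times(num_batches=24, num_workers=8,
                            load_time=20.0, step_time=0.1)
for i, w in enumerate(waits):
    print(f"batch {i:2d}: waited {w:5.1f} s")
```

In this model the loop waits only on batches 0, 8 and 16, which matches the "one slow batch every num_workers" pattern. The stall per round is about `load_time - num_workers * step_time`, so within the model it goes away only if loading a batch gets cheaper or there are enough workers to cover the loading time.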