Data loading time is nearly proportional to batch size

ptrblck · October 31, 2020, 2:53am

Multiple workers load the data in the background while the model training is executed.
If the workers are not fast enough, or are bottlenecked by e.g. a local (spinning) HDD, they won’t be able to preload the next batch before the training loop needs it, so the loop has to wait for the data. That seems to be the case for your training routine.
Have a look at this post for more information.
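
To make the effect above more concrete, here is a rough sketch in plain Python. It is only a toy model, not how the DataLoader is actually implemented: the function name `simulate_wait` and all timings (`per_sample_load_s`, `step_s`) are made-up assumptions, and it ignores the prefetch limit and any I/O contention between workers. It only assumes that each worker builds whole batches and that batches are handed out round-robin.

```python
def simulate_wait(batch_size, num_batches=100, num_workers=4,
                  per_sample_load_s=0.01, step_s=0.05):
    """Toy model: average time (s) the training loop waits for data per batch.

    All parameters are hypothetical; measure your own pipeline for real numbers.
    """
    # Assumption: one worker loads one whole batch, sample by sample.
    load_s = batch_size * per_sample_load_s
    clock = 0.0       # simulated wall-clock time of the training loop
    total_wait = 0.0
    for k in range(num_batches):
        # Batch k is the (k // num_workers + 1)-th batch of its worker,
        # and all workers start loading at time 0.
        ready_at = (k // num_workers + 1) * load_s
        wait = max(0.0, ready_at - clock)  # loop stalls until the batch is ready
        total_wait += wait
        clock += wait + step_s             # then runs one training step
    return total_wait / num_batches


if __name__ == "__main__":
    for bs in (8, 64, 128):
        print(f"batch_size={bs}: avg wait {simulate_wait(bs):.3f}s per batch")
```

In this model the wait is close to zero as long as `batch_size * per_sample_load_s / num_workers` stays below `step_s`. Once the workers fall behind, the wait per batch grows roughly linearly with the batch size, which would match what you are seeing. Increasing `num_workers` or reducing the per-sample loading cost (e.g. faster storage) shrinks it again.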