Data loading time is nearly proportional to batch size

With multiple workers, the DataLoader preloads the next batches in the background while the model training is executed on the current one.
If the workers are not fast enough or are bottlenecked by e.g. a local (spinning) HDD, they won't be able to preload the next batch in time and the training loop will have to wait for the data, which seems to be the case for your training routine.
Have a look at this post for more information.
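As a rough check, you could time how long each iteration waits for the next batch. Here is a minimal sketch of that idea, using a random `TensorDataset` as a stand-in for your real `Dataset` and a tiny placeholder model; `batch_size` and `num_workers` are just example values you would tune for your setup:

```python
import time
import torch
from torch.utils.data import DataLoader, TensorDataset

device = "cuda" if torch.cuda.is_available() else "cpu"

# Dummy data standing in for your real Dataset
dataset = TensorDataset(
    torch.randn(2000, 3, 64, 64),
    torch.randint(0, 10, (2000,)),
)
loader = DataLoader(dataset, batch_size=64, num_workers=4, pin_memory=True)

# Placeholder model / loss / optimizer
model = torch.nn.Conv2d(3, 10, 3).to(device)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

data_time = 0.0
end = time.perf_counter()
for data, target in loader:
    # Time spent waiting for the next batch; if the workers keep up,
    # this stays close to zero after the first iteration.
    data_time += time.perf_counter() - end

    data = data.to(device, non_blocking=True)
    target = target.to(device, non_blocking=True)

    optimizer.zero_grad()
    output = model(data).mean(dim=[2, 3])  # pool to [batch_size, 10] logits
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()

    end = time.perf_counter()

print(f"total time spent waiting for data: {data_time:.3f}s")
```

If the accumulated waiting time stays large even after the first few iterations, the data loading is the bottleneck and you could try increasing `num_workers`, moving the data to a faster drive, or caching/pre-decoding the samples.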