Different batches take different times to load

I’m using the snippet given below to measure the amount of time each batch takes to load.

import time

import torch

# Make sure pending GPU work is finished before starting the timer.
torch.cuda.synchronize()
st_batch = time.time()

for cnt, batch in enumerate(train_data_loader):
    torch.cuda.synchronize()
    end_batch = time.time()
    print(end_batch - st_batch)  # time spent waiting for this batch
    # do something
    st_batch = time.time()

I observe that some batches take 3-5 seconds to load, whereas others take just 0.002 s.

I assume this is a bottleneck while training my network. Is there any way around this?

Are you using multiple workers in your DataLoader?
If so, could you try to increase the number?
It looks like your training procedure sometimes has to wait for the DataLoader to provide new batches. Also, is your data stored locally on an SSD?
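
For reference, a minimal sketch of what that might look like; the dataset, batch size, and worker count below are placeholders, not values from the original post. With num_workers > 0, batches are prefetched in background processes, and pin_memory=True can speed up host-to-GPU copies.

import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder dataset; substitute your own map-style Dataset.
dataset = TensorDataset(torch.randn(10_000, 3, 32, 32),
                        torch.randint(0, 10, (10_000,)))

train_data_loader = DataLoader(
    dataset,
    batch_size=64,
    shuffle=True,
    num_workers=4,    # background worker processes prefetching batches
    pin_memory=True,  # page-locked memory for faster GPU transfers
)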

There was an overhead in my __getitem__ function; I fixed it, and it works fine now.
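
The post doesn't say what the overhead actually was, but a common culprit is repeating expensive one-time work on every __getitem__ call. A purely hypothetical sketch (the class names and CSV path are made up for illustration, and the file is assumed to contain only numeric columns):

import pandas as pd
import torch
from torch.utils.data import Dataset

class SlowDataset(Dataset):
    def __init__(self, csv_path):
        self.csv_path = csv_path

    def __len__(self):
        return len(pd.read_csv(self.csv_path))

    def __getitem__(self, idx):
        # Re-reads the whole file for every sample: a typical hidden overhead.
        df = pd.read_csv(self.csv_path)
        return torch.tensor(df.iloc[idx].values)

class FastDataset(Dataset):
    def __init__(self, csv_path):
        # Load once up front so __getitem__ only indexes into memory.
        self.data = torch.tensor(pd.read_csv(csv_path).values)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]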

Hi @Pramodith, can you share what the overhead in your __getitem__ was?