DataLoader takes too much time to run

```python
import time

def training(epoch, net, device, criterion, train_data, optimizer, tb):
    """
    Run one training epoch
    :param epoch: Current epoch
    :param net: Network
    :param device: Torch device
    :param criterion: Function to evaluate the loss
    :param train_data: Training DataLoader
    :param optimizer: Optimizer
    :return: Average losses for the epoch
    """

    results = 0
    batch_idx = 0
    # switch the network to training mode
    net.train()

    tic = time.perf_counter()

    for sample in train_data:

        toc = time.perf_counter()

        batch_idx += 1

        # move the normalized image to the device
        img = sample['img'].to(device)

        # move the ground-truth bounding boxes to the device
        ground_bb = sample['bb'].to(device)
```

Each iteration takes too much time loading the data. In fact, timing the interval between the point just before the for loop and the start of its first iteration, it takes about 1 minute. How can I speed up each iteration?

The first iteration can be slower, since the first batch has to be loaded first. The following iterations should be faster, though, as long as the actual model workload is not tiny.
If you want to avoid this slowdown at the start of each epoch, you could use this workaround.
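For reference, a minimal sketch of keeping the worker processes alive across epochs via `persistent_workers=True` (available in recent PyTorch versions; the dataset, batch size, and worker count below are placeholders, and this is not necessarily the same approach as the linked workaround):

```python
from torch.utils.data import DataLoader

# Hypothetical setup: `train_dataset` stands in for your custom Dataset.
train_data = DataLoader(
    train_dataset,
    batch_size=32,            # placeholder batch size
    shuffle=True,
    num_workers=4,
    pin_memory=True,
    persistent_workers=True,  # keep worker processes alive between epochs
)
```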


Thank you so much. However, I see a slowdown not only at the beginning of every epoch but also in each iteration of `for sample in train_data`. I think this slowdown is caused by the data transformations (rotation and zooming) inside the `__getitem__` method of my custom dataset. How can I perform data loading and training in parallel?

If you are using multiple workers in the DataLoader via `num_workers >= 1`, each worker will load its batches in the background while the GPU is busy with the model training.
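As a minimal sketch (the dataset, batch size, and worker count are assumptions on my side), this is all that is needed on the DataLoader side; the workers then prepare the next batches in background processes while the main process runs the forward and backward passes:

```python
from torch.utils.data import DataLoader

# `my_dataset` is assumed to be your custom Dataset with the heavy __getitem__.
train_data = DataLoader(
    my_dataset,
    batch_size=16,     # placeholder batch size
    shuffle=True,
    num_workers=4,     # batches are loaded in background worker processes
    pin_memory=True,   # usually speeds up host-to-GPU copies
)
```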

I have done that by setting `num_workers = 8`. For the first 8 training iterations I don't have to wait for the data, but after that the training loop stops and waits for the data to be loaded again.

If you are seeing the slowdown after a full epoch, please refer to my previous post for a workaround.

Unfortunately, I see a slowdown after each batch.

This could indicate that your training loop is growing the computation graph, so it would not be the data loading but the training itself that gets slower.
Could you remove the model and the training step completely and just time the data loading loop?
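For example, a stripped-down timing loop (no model, no optimizer; `train_data` is the DataLoader from above) could look like this:

```python
import time

# Time the DataLoader alone, without any model work, to isolate the data pipeline.
tic = time.perf_counter()
for batch_idx, sample in enumerate(train_data):
    toc = time.perf_counter()
    print(f"batch {batch_idx}: {toc - tic:.3f}s to produce this batch")
    tic = toc  # restart the timer for the next batch
```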

Thank you for your answer. I have found the problem. In the `__getitem__` method of my custom dataset, the image transformations were performed with functions from skimage that take too much time. I have replaced them with OpenCV functions that perform more or less the same operations but much faster. This way the batches load much faster and it solves my problem.
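For anyone running into the same issue, this is a rough sketch of the kind of change I made (the class, angle, scale, and paths below are placeholders, not my actual code), doing the rotation and zoom with a single OpenCV warp inside `__getitem__`:

```python
import cv2
from torch.utils.data import Dataset

class MyDataset(Dataset):
    """Hypothetical custom dataset illustrating the skimage -> OpenCV swap."""

    def __init__(self, image_paths, angle=15, scale=1.2):
        self.image_paths = image_paths
        self.angle = angle
        self.scale = scale

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        img = cv2.imread(self.image_paths[idx])  # BGR uint8 array

        # Rotation + zoom in a single warp, instead of skimage.transform.rotate/rescale
        h, w = img.shape[:2]
        m = cv2.getRotationMatrix2D((w / 2, h / 2), self.angle, self.scale)
        img = cv2.warpAffine(img, m, (w, h), flags=cv2.INTER_LINEAR)

        return {'img': img}
```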