How to refresh an iterator on DataLoader

I would like to use iter() on a DataLoader instead of a for loop. At the beginning of each training step I use the following code:

from torch.utils.data import DataLoader

train_loader = DataLoader(dataset=train_dataset, batch_size=8, shuffle=True, num_workers=4)
train_loader_iter = iter(train_loader)
while True:
    try:
        data_dict = next(train_loader_iter)
    except StopIteration:
        print('Refreshing iterator...')
        train_loader_iter = iter(train_loader)
        data_dict = next(train_loader_iter)
        print('Iterator refreshed...')

However, when my iterator should refresh (i.e. I see 'Refreshing iterator...') the process hangs and gets stuck indefinitely. Am I doing something wrong, or is this just not possible? I'm on PyTorch 1.6.0 and my dataset is a simple subclass of the Dataset class.

Your code loops over the data forever (that's why it looks like your process is hung). You can limit the number of times you 'refresh' your iterator (e.g. 100) by replacing "while True" with "for _ in range(100)".
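For illustration, here is a minimal sketch of that bounded-refresh pattern. The TensorDataset here is a hypothetical stand-in for your own Dataset subclass, and num_workers=0 keeps the sketch self-contained:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in dataset (hypothetical); substitute your own Dataset subclass.
train_dataset = TensorDataset(torch.arange(32).float())
train_loader = DataLoader(dataset=train_dataset, batch_size=8,
                          shuffle=True, num_workers=0)

train_loader_iter = iter(train_loader)
refreshes = 0
for _ in range(100):  # bounded number of steps instead of while True
    try:
        data_dict = next(train_loader_iter)
    except StopIteration:
        # Epoch exhausted: build a fresh iterator and take its first batch.
        train_loader_iter = iter(train_loader)
        data_dict = next(train_loader_iter)
        refreshes += 1
```

With 32 samples and batch_size=8, each pass yields 4 batches, so the loop terminates after exactly 100 steps, refreshing the iterator at every epoch boundary along the way.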

That is not the problem I'm having; it's exactly what I want. The problem is that the first time I refresh the iterator my process gets stuck. It would also happen if I swapped the "while True" with "for _ in range(2)".

Hmm, I can’t reproduce this issue on an example dataset. Can you check if you have the same problem with another dataset?
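A minimal sketch of such a check might look like this. ToyDataset is a hypothetical stand-in, and num_workers=0 is used deliberately: creating a new iterator with num_workers > 0 spawns fresh worker processes, so if this version works but your num_workers=4 setup hangs, worker startup is a likely suspect:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    """Hypothetical minimal Dataset subclass, used only to isolate the issue."""
    def __len__(self):
        return 16

    def __getitem__(self, idx):
        return torch.tensor(idx)

loader = DataLoader(ToyDataset(), batch_size=8, shuffle=True, num_workers=0)

it = iter(loader)
_ = next(it)
_ = next(it)        # epoch exhausted after this (16 samples / batch size 8)

it = iter(loader)   # 'refresh': should yield a new batch immediately
batch = next(it)
```

If this runs to completion while your original dataset does not, the difference lies in the dataset (or in the worker processes), not in the refresh pattern itself.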