Start dataloader at specific batch_idx

I would like to start my data loader at a specific batch_idx.

I want to be able to continue my training from the exact batch_idx where it stopped or crashed.
I don’t use shuffling so it should be possible.

The only solution I came up with is the naive one: running through the for loop until I get to where I want:

start_batch_idx, ... = load_saved_training()
for batch_idx, (data, target) in enumerate(train_loader):
    if batch_idx < start_batch_idx:
        continue  # skip batches that were already processed
    # train
    if batch_idx % 100 == 0:  # checkpoint every 100 batches
        # save training (including batch_idx)

Hi,

I'm not sure exactly how to do that, but I would suggest using a custom Sampler based on the SequentialSampler, which is used by default when shuffle=False.
In particular, you can create one that returns only part of the indices. Then recreate a dataloader with the regular sampler for the rest of the training.
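
A minimal sketch of that idea, in case it helps. The class name `ResumeSequentialSampler`, the toy `TensorDataset`, and the values of `batch_size` and `start_batch_idx` are all made up for illustration; the assumption is that the batch size is the same as in the interrupted run, so that `start_batch_idx * batch_size` is the first sample not yet seen.

```python
import torch
from torch.utils.data import DataLoader, Sampler, TensorDataset


class ResumeSequentialSampler(Sampler):
    """Hypothetical sampler: yields indices start_index .. len(data_source) - 1 in order."""

    def __init__(self, data_source, start_index=0):
        self.data_source = data_source
        self.start_index = start_index

    def __iter__(self):
        return iter(range(self.start_index, len(self.data_source)))

    def __len__(self):
        return max(len(self.data_source) - self.start_index, 0)


# Toy data standing in for the real dataset: 20 samples, target i for sample i.
dataset = TensorDataset(torch.arange(20).float().unsqueeze(1), torch.arange(20))
batch_size = 4
start_batch_idx = 3  # assumed to come from the saved training state

# Loader for the interrupted epoch: starts at the first unseen sample.
resume_loader = DataLoader(
    dataset,
    batch_size=batch_size,
    sampler=ResumeSequentialSampler(dataset, start_batch_idx * batch_size),
)

# Loader with the default sequential sampler for all later epochs.
full_loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)

# enumerate(..., start=...) keeps batch_idx consistent with the original run.
seen = []
for batch_idx, (data, target) in enumerate(resume_loader, start=start_batch_idx):
    seen.append((batch_idx, target.tolist()))
```

Unlike skipping inside the for loop, the skipped samples are never loaded from the dataset here, which matters when loading or preprocessing is expensive.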
