Nested loop over two dataloaders


I am training a model with two dataloaders in an alternate optimization fashion.

The algorithm looks like this:

def train(K):  # K: number of steps for loss2
    for batch1 in dataloader1:
        # ... optimize loss1 on batch1 ...
        for step, batch2 in enumerate(dataloader2):
            if step >= K:
                break
            # ... optimize loss2 on batch2 ...

Now even with K=1, the training is very slow. I have checked the training time by removing the inner loop, and it finishes relatively quickly. Roughly, without the nested loop it takes around 7 minutes to finish, and with the nested loop it takes around 45 minutes. With higher K values, it's much slower. I have tried restricting the number of workers on dataloader2 to 1, but it doesn't make much of a difference.

Any suggestion on optimizing the runtime in such a setup?

Wouldn’t this be expected assuming both DataLoaders have approx. the same number of elements?
For each batch in dataloader1 you are iterating the complete dataloader2.
Assuming both DataLoaders have 7 batches (each one takes ~1 min to load and process), you would perform (1+7)*7 = 56 batch-loading steps, thus ~56 mins.

Thanks for your response.

Sorry, I was a bit unclear in the question.

For each batch of dataloader1, I am sampling only K batches (K ranges from 1 to 20) from dataloader2 and optimizing a loss on them.

Even for K=1, i.e. one batch from dataloader1 and one batch from dataloader2, it is taking a lot of time.