I am training a model with two dataloaders in an alternate optimization fashion.
The algorithm looks like this:
def train(K):  # K: number of optimization steps for loss2
    for batch1 in dataloader1:
        # ... optimize loss1 on batch1 ...
        for step, batch2 in enumerate(dataloader2):
            if step > K:
                break
            # ... optimize loss2 on batch2 ...
Now even with K=1, the training is very slow. I checked the training time by removing the inner loop, and it finishes relatively quickly: roughly 7 minutes without the nested loop versus around 45 minutes with it. With higher K values it is much slower. I have tried restricting the number of workers on dataloader2 to 1, but it doesn't make much of a difference.
Any suggestion on optimizing the runtime in such a setup?
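If part of the overhead comes from restarting dataloader2 (and respawning its workers) on every outer iteration, one common mitigation is to keep a single iterator alive across outer steps and only re-create it when it is exhausted. This is a sketch under that assumption, not the poster's actual code; `dataloader1`, `dataloader2`, and the loss updates are stand-ins, and the step counter is returned only so the sketch can be sanity-checked:

```python
def train(dataloader1, dataloader2, K):
    loader2_iter = iter(dataloader2)  # created once, reused across outer steps
    steps = 0
    for batch1 in dataloader1:
        # ... optimize loss1 on batch1 (omitted in this sketch) ...
        for _ in range(K):
            try:
                batch2 = next(loader2_iter)
            except StopIteration:
                loader2_iter = iter(dataloader2)  # restart only when exhausted
                batch2 = next(loader2_iter)
            # ... optimize loss2 on batch2 (omitted in this sketch) ...
            steps += 1
    return steps  # total inner optimization steps
```

If dataloader2 is a PyTorch DataLoader with multiprocessing workers, constructing it with `persistent_workers=True` additionally keeps the worker processes alive between passes, so the restart in the `except` branch does not pay the worker-startup cost again.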
Wouldn’t this be expected assuming both DataLoaders have approx. the same number of elements?
For each batch in dataloader1 you are iterating over the complete dataloader2.
Assuming both DataLoaders have 7 batches (and each batch takes ~1 minute to load and process), you would perform (1+7)*7 = 56 batch loads, thus ~56 minutes.
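The cost estimate above can be written as a quick sanity check (the batch counts are illustrative, matching the 7-batch assumption):

```python
def total_batch_loads(outer_batches, inner_batches):
    # Each outer iteration loads 1 batch from dataloader1
    # plus a full pass over dataloader2.
    return (1 + inner_batches) * outer_batches

print(total_batch_loads(7, 7))  # 56 batch loads -> ~56 minutes at ~1 min each
```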