Your posted code could work, assuming your model is small enough and you have sufficient memory.
Assuming you are summing the losses inside the DataLoader loop, this approach would keep all computation graphs alive and thus allocate more memory in each iteration.
Depending on the size of your dataset, you might quickly run out of memory.
As an alternative, you could call backward() on each loss in every iteration, so that Autograd can free the intermediate tensors, as they are no longer needed.
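A minimal sketch of this alternative, using a hypothetical tiny model and random data (the model, criterion, and shapes are placeholders, not from your code): instead of summing the losses and calling backward() once, backward() is called on each loss so its graph can be freed, while the gradients still accumulate in the .grad attributes.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical setup for illustration only.
model = nn.Linear(10, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

loader = DataLoader(
    TensorDataset(torch.randn(64, 10), torch.randn(64, 1)),
    batch_size=16,
)

optimizer.zero_grad()
for x, y in loader:
    loss = criterion(model(x), y)
    # backward() frees this iteration's intermediate tensors;
    # gradients accumulate in param.grad across iterations.
    loss.backward()
optimizer.step()
```

The accumulated result is mathematically the same as calling backward() once on the summed loss, but the peak memory usage stays roughly constant per iteration.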
I have additional questions. If you want to use memory efficiently when training with multiple DataLoaders, is there a way to modify the internals of the DataLoader?
Or is there a way to share the queue of each DataLoader?
You could concatenate the datasets via ConcatDataset and pass it to a single DataLoader.
However, you would have to make sure that the output sizes of all tensors are equal (doesn’t seem to be the case in your example) or you would have to use a custom collate_fn.
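Here is a small sketch of that approach, assuming two hypothetical datasets whose samples have different feature sizes. Since the default collate function cannot stack tensors of unequal shape, the custom collate_fn returns the samples as a list instead:

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Two hypothetical datasets with different feature sizes.
ds_a = TensorDataset(torch.randn(8, 3))
ds_b = TensorDataset(torch.randn(8, 5))
dataset = ConcatDataset([ds_a, ds_b])

def collate_fn(batch):
    # The default collation would fail on mismatched shapes,
    # so return a list of tensors instead of stacking them.
    return [sample[0] for sample in batch]

loader = DataLoader(dataset, batch_size=4, shuffle=True, collate_fn=collate_fn)
batch = next(iter(loader))  # list of 4 tensors, shapes mixed (3,) and (5,)
```

If all your tensors do share the same shape, you can drop the custom collate_fn and the default one will stack them into a single batch tensor.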
If you are using different loaders, as posted in your initial code, then yes, each loader can use different input shapes and batch sizes.
If you concatenate the datasets and use a single loader, the batch size is fixed at DataLoader creation, unless you disable automatic batching and customize the loading pipeline.
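To illustrate the last point, here is a sketch of disabling automatic batching by passing batch_size=None and letting a BatchSampler yield the index batches instead (a plain TensorDataset is used here, since ConcatDataset expects single integer indices):

```python
import torch
from torch.utils.data import (
    BatchSampler, DataLoader, RandomSampler, TensorDataset
)

dataset = TensorDataset(torch.randn(16, 3))

# Automatic batching disabled: the sampler yields lists of indices,
# and the dataset is indexed with the whole list at once.
sampler = BatchSampler(RandomSampler(dataset), batch_size=4, drop_last=False)
loader = DataLoader(dataset, sampler=sampler, batch_size=None)

batch = next(iter(loader))
```

This gives you full control over how samples are grouped, e.g. you could write a sampler that varies the batch size per iteration.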