Memory error when trying to train with two different dataloaders in parallel

I am trying to train my model using 2 dataloaders from different image datasets.

Because my datasets are not the same length, I use cycle() and zip() to iterate over both at once, as suggested here: How to iterate over two dataloaders simultaneously using pytorch?
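To make the pattern concrete, here is a minimal sketch with plain lists standing in for the two DataLoaders. The names short_loader and long_loader and the toy values are placeholders, not my real data. The shorter sequence is wrapped in cycle() so that zip() keeps going until the longer one is exhausted:

```python
from itertools import cycle

# Plain lists stand in for the two DataLoaders (hypothetical toy data).
short_loader = ["a0", "a1"]
long_loader = ["b0", "b1", "b2", "b3", "b4"]

# cycle() repeats the shorter sequence; zip() stops when the longer one ends.
pairs = list(zip(cycle(short_loader), long_loader))

for i, (x1, x2) in enumerate(pairs):
    print(i, x1, x2)
```

This yields one pair per item of the longer sequence, with the shorter one wrapping around.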

But I am getting the following error:

  File "/home/Desktop/example/", line 229, in train_2
    for i, (x1, x2) in enumerate(zip(cycle(train_loader_1), train_loader_2)):
  File "/home/.conda/envs/3dcnn/lib/python3.7/site-packages/torch/utils/data/", line 346, in __next__
    data = self.dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/.conda/envs/3dcnn/lib/python3.7/site-packages/torch/utils/data/_utils/", line 47, in fetch
    return self.collate_fn(data)
  File "/home/.conda/envs/3dcnn/lib/python3.7/site-packages/torch/utils/data/_utils/", line 80, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/.conda/envs/3dcnn/lib/python3.7/site-packages/torch/utils/data/_utils/", line 80, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/.conda/envs/3dcnn/lib/python3.7/site-packages/torch/utils/data/_utils/", line 56, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 154140672 bytes. Error code 12 (Cannot allocate memory)

I tried to solve this by setting num_workers=0, decreasing the batch size, and using pin_memory=False and shuffle=False, but none of it worked.

My current setup has 256GB of RAM and 4 NVIDIA Tesla V100 GPUs. When I train my model on the two datasets sequentially rather than in parallel (i.e. I train one epoch on the first dataloader and then one on the other), it works totally fine and I consume only 21GB.

Is there a more efficient way to get the same functionality as enumerate(zip(cycle(train_loader_1), train_loader_2))?
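One direction I am considering, as a sketch only: according to the Python documentation, itertools.cycle saves a copy of every item it yields so that it can replay them, which here would mean holding every batch of train_loader_1 in memory. A generator that re-iterates the loader on each pass would not keep those copies. The helper name restart_cycle is my own, plain lists stand in for the DataLoaders, and I have not verified that this resolves the error above:

```python
def restart_cycle(iterable):
    """Yield items from `iterable` forever by re-iterating it on each pass.

    Unlike itertools.cycle, no copy of the yielded items is kept.
    """
    while True:
        empty = True
        for item in iterable:  # calls iter(iterable) again on every pass
            empty = False
            yield item
        if empty:
            return  # avoid spinning forever on an empty iterable


# Plain lists stand in for the two DataLoaders (hypothetical toy data).
short_loader = ["a0", "a1"]
long_loader = ["b0", "b1", "b2", "b3", "b4"]

for i, (x1, x2) in enumerate(zip(restart_cycle(short_loader), long_loader)):
    print(i, x1, x2)
```

With a real DataLoader, each pass would create a fresh iterator, so with shuffle=True the batches would be reshuffled on every pass instead of replayed in the same order as with cycle().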