[Urgent] - RAM Memory Leak iterating a short-lived dataloader

Hi, I am experiencing a RAM (not VRAM) memory leak while using next(iter(dataloader)) on a short-lived dataloader.

I am training a model in the context of Meta-Learning Few-Shot Classification and I have to randomly sample different classes AND images of those classes for each step of my training loop.

I therefore create a new DataLoader for each training step, and each one contains only a single batch.

I already saw this thread that addresses a similar problem, but I think it does not apply to mine because I have to create a new DataLoader per training step: Get a single batch from DataLoader without iterating · Issue #1917 · pytorch/pytorch · GitHub

I do the following per training step:

# Create my task (collection of randomly sampled classes)
task = task_type(meta_train_classes, ...)
# Fetch DataLoader for that task
dataloader = fetch_dataloaders('train', task)
...
# Iterate dataloader (causes memory leak)
X_sup, Y_sup = next(iter(dataloader))

I also use numpy arrays in my custom Dataset class (and from memory profiling, memory does not seem to leak from __getitem__()).

I am wondering if I could do this in some other way to prevent the memory leak. I have to train my model for a huge number of steps, and my 64GB of RAM fills up slowly, but fast enough that I can’t fully train the model.
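One direction that might avoid building a DataLoader per step is to keep a single long-lived DataLoader and move the per-step class/image sampling into a batch sampler. The sketch below is only an illustration under assumptions: EpisodicBatchSampler, n_way, k_shot and n_episodes are hypothetical names, and it assumes each task is "pick some classes at random, then some images from each", which may not match what task_type actually does.

```python
import random
from collections import defaultdict


class EpisodicBatchSampler:
    """Yield one list of dataset indices per training step (one episode).

    Hypothetical helper: assumes an episode is `n_way` random classes with
    `k_shot` random images from each of them.
    """

    def __init__(self, labels, n_way, k_shot, n_episodes, seed=None):
        self.n_way = n_way
        self.k_shot = k_shot
        self.n_episodes = n_episodes
        self.rng = random.Random(seed)
        # Group dataset indices by class label once, up front.
        self.indices_by_class = defaultdict(list)
        for idx, label in enumerate(labels):
            self.indices_by_class[label].append(idx)
        # Only classes with at least k_shot images can be sampled.
        self.classes = [
            c for c, idxs in self.indices_by_class.items() if len(idxs) >= k_shot
        ]
        if len(self.classes) < n_way:
            raise ValueError("not enough classes with k_shot images")

    def __len__(self):
        return self.n_episodes

    def __iter__(self):
        for _ in range(self.n_episodes):
            batch = []
            # Draw the classes for this episode, then images within each class.
            for c in self.rng.sample(self.classes, self.n_way):
                batch.extend(self.rng.sample(self.indices_by_class[c], self.k_shot))
            yield batch


# Toy label list: 5 classes with 10 images each.
labels = [i // 10 for i in range(50)]
sampler = EpisodicBatchSampler(labels, n_way=3, k_shot=2, n_episodes=4, seed=0)
for episode in sampler:
    print(episode)
```

An object like this can be passed as the batch_sampler argument of torch.utils.data.DataLoader (instead of batch_size/shuffle), so the loader is created once before the training loop and then iterated with a plain for loop, one episode per step. Whether that removes the leak in this particular setup is untested.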

Did you make sure the old DataLoader is deleted? That sounds like it could be a potential leak!

Yes, I have already tried both del dataloader and the garbage collector, but unfortunately neither has worked.

Are you using TensorBoard or TensorBoardX in your code? I had a similar issue where I couldn’t find a bug in the dataloader. However, tensorboardX was slowly consuming RAM.
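To narrow down whether the growth comes from the dataloader or from something else such as a logger, one option is to compare tracemalloc snapshots taken before and after a number of steps. This is a minimal sketch: top_growth and leaky_step are hypothetical names, and leaky_step is only a stand-in for a real training step. Note that tracemalloc only reports allocations made through Python's memory allocators, so memory held by native libraries may not show up.

```python
import tracemalloc


def top_growth(step_fn, n_steps=100, top=5):
    """Run step_fn n_steps times and return the source lines whose
    traced Python allocations changed the most in between."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    for _ in range(n_steps):
        step_fn()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Results are sorted by absolute size difference, largest first.
    return after.compare_to(before, "lineno")[:top]


# Stand-in for a training step that keeps references alive by accident.
leaked = []


def leaky_step():
    leaked.append(bytearray(10_000))


for stat in top_growth(leaky_step, n_steps=50):
    print(stat)
```

Running the same check with the logging calls disabled, and then with the dataloader replaced by a fixed batch, should indicate which part of the step the growth follows.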
