Is it possible to free up a DataLoader?

I am using separate DataLoaders for the train set and the test set, so in all I have two DataLoaders.

I do training and testing in every epoch.

Is there a way to free up the DataLoader that is not being used (e.g. free up the train DataLoader while testing, and the test DataLoader while training) so that I can increase the batch size of the one being used?

From my understanding, the DataLoader is just a wrapper around your train/test set, and it is the train and test sets themselves that eat the memory. So you would need to delete the train/test set to free up the allocated memory. I could be wrong, but I guess it could be done like this:

```python
test_loader = None
del test_loader
test_set = None
del test_set
```

However, you will have to re-create the deleted dataset and its associated loader before using them again. Moreover, you will have to weigh the reloading time of these sets against the memory you gain from deleting them.
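
A slightly fuller sketch of that idea, using a throwaway `TensorDataset` as a stand-in for the real test set (the `torch.cuda.empty_cache()` call is an optional extra and only helps if cached GPU memory is the problem):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in test set; in practice this would be your own Dataset.
test_set = TensorDataset(torch.randn(1000, 3, 32, 32), torch.randint(0, 10, (1000,)))
test_loader = DataLoader(test_set, batch_size=64)

# ... run the test phase with test_loader ...

# Drop the Python references so the objects become garbage-collectable.
del test_loader
del test_set
torch.cuda.empty_cache()  # optional: release cached GPU memory back to the driver

# The dataset and loader have to be re-created before the next test phase,
# so the reload time has to be weighed against the memory freed here.
```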

I think the DataLoader is just a generator. What takes memory are the model weights and the data being yielded. The memory for the model weights will always be there, since you are training them all the time. As for the data yielded by the DataLoader, it is freed at the end of every iteration since it is a local variable. So I think there is no need to free the DataLoader.
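
One way to sanity-check that claim is to watch the allocated GPU memory per iteration; if the yielded batches were accumulating, the number would keep growing. A minimal sketch with a dummy dataset (not from the original post):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

loader = DataLoader(TensorDataset(torch.randn(256, 3, 32, 32)), batch_size=32)

for (images,) in loader:
    if torch.cuda.is_available():
        images = images.cuda()
        # `images` is rebound on the next iteration, so the previous batch
        # becomes unreachable and its memory can be reused.
        print(f"allocated: {torch.cuda.memory_allocated() / 1024**2:.1f} MiB")
```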

Hi @Deeply, thanks for the suggestion. I tried this, but it is not working; the GPU still runs out of memory.

Hi @chenglu, thanks for the answer. I think the training and testing batch sizes matter. I tried a larger test batch size and it didn't work, but when I reduce the test batch size, it works. So it seems that even if the DataLoader is emptied out at the end of every epoch, it still occupies memory.

Have you used `with torch.no_grad()` during your test phase? This avoids storing the intermediate variables needed for the backward pass, which is not necessary for testing.
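
For context, a minimal evaluation-loop sketch along these lines; the model, loader, and accuracy metric below are stand-ins, not the original poster's code:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

model = nn.Linear(10, 2)  # stand-in model
test_loader = DataLoader(
    TensorDataset(torch.randn(128, 10), torch.randint(0, 2, (128,))),
    batch_size=64,
)

model.eval()  # switch layers like dropout/batchnorm to eval behaviour
correct = 0
with torch.no_grad():  # no graph is built, so intermediate activations are not kept
    for inputs, targets in test_loader:
        outputs = model(inputs)
        correct += (outputs.argmax(dim=1) == targets).sum().item()
print(f"accuracy: {correct / 128:.2%}")
```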


Hi @ptrblck, thanks a lot! I am able to increase the batch size after using `with torch.no_grad()`.

Hi all,

I am running into what I think is a memory leak. I am not sure why my memory isn't getting freed up while I iterate through my DataLoader during inference. If anyone has ideas, please let me know!

**More Details**

I noticed that, according to the tracemalloc package, the memory appears to accumulate at:

```python
def __getitem__(self, index):
    ...
    # Convert to float32; the image is also normalized to [0, 1].
    image = (image / 255).astype(np.float32)
    mask = mask.astype(np.float32)
    ...
```

Iteration through the inference DataLoader:

```python
...
    tracemalloc.start()

    snapshot1 = tracemalloc.take_snapshot()
    for (batch_idx, batch) in tqdm(enumerate(loader), total=len(loader)):
        try:
            if batch_idx % 10 == 0:
                snapshot1 = tracemalloc.take_snapshot()

            log.info(f"Running inference on {batch['image_path']}")

            batch_pred = inference_step(
                model, checkpoint_fp, batch, thresh, model_loaded=True
            )

            res = inference_ds.save_prediction_batch(
                batch_pred,
                out_folder,
                config=cfg,
                thresh=thresh,
                channel=channel,
            )
            successes.append(res)
            log.info("Inference complete")

            if batch_idx % 10 == 9:
                snapshot2 = tracemalloc.take_snapshot()
                top_stats = snapshot2.compare_to(snapshot1, "lineno")

                log.info("\n\n [ Top 10 differences ]")
                for stat in top_stats[:10]:
                    log.info(f"Snapshot difference: \n {stat}")
...
```

Are you storing tensors that are attached to the computation graph in e.g. a list?
This would not be a leak, but the increased memory usage would be expected in this case (although this behavior is commonly referred to as a "memory leak").
Try to narrow down which part of your code is causing the increase in memory usage, as e.g. `save_prediction_batch` and `successes.append(res)` might store the entire computation graph.
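
If that turns out to be the cause, detaching the predictions (and optionally moving them to the CPU) before storing them is the usual fix. A sketch with a placeholder model and loader, not the poster's actual pipeline:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

model = nn.Linear(10, 2)  # placeholder for the real model
loader = DataLoader(TensorDataset(torch.randn(64, 10)), batch_size=8)

results = []
with torch.no_grad():  # inference only: no graph is attached to the outputs
    for (inputs,) in loader:
        preds = model(inputs)
        # detach() drops any remaining graph reference and cpu() releases the
        # GPU copy, so appending the result does not keep the whole graph alive.
        results.append(preds.detach().cpu())
```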

I figured out the issue, @ptrblck!

I was using this package: GitHub - msamogh/nonechucks: Deal with bad samples in your dataset dynamically, use Transforms as Filters, and more!

With this issue:

Subtle bug to track down 🙂