GPU memory accumulates during consecutive forward passes

I am running a densenet161 image classifier on the GPU.

After consecutive forward passes (roughly 2-5 batches of images, batch_size=128), GPU memory accumulates and is not freed. Can someone tell me what is taking up so much memory?
By the way, I am using torch.no_grad() during the forward pass.

@ptrblck It would also be helpful if you could explain which factors determine memory usage during inference.

Could you post a snippet of the code where you do the forward pass, so that we can take a look?

I am using an Inference class, and the following code is one of its member functions. Every batch is of size 128.

def predict(self, image_objs_batches):
    for batch in image_objs_batches:
        with torch.no_grad():
            _, outputs = self.model(batch)

Are you sure the memory leak happens during inference? If not, could you post the training loop? If you accumulate the losses in a list or something similar without calling .item() on them, the computation graphs attached to those losses will be kept alive and memory will accumulate.
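The pattern looks roughly like this (a toy sketch with a made-up model and random data, not your actual code):

import torch
import torch.nn as nn

# Toy sketch: a tiny model and random inputs, only to illustrate the .item() point.
device = torch.device("cuda")
model = nn.Linear(10, 2).to(device)
criterion = nn.CrossEntropyLoss()

total_loss = 0.0
for _ in range(100):
    data = torch.randn(128, 10, device=device)
    target = torch.randint(0, 2, (128,), device=device)
    loss = criterion(model(data), target)

    # total_loss += loss        # keeps every iteration's graph (and activations) alive
    total_loss += loss.item()   # stores a plain Python float instead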

I am facing this memory accumulation issue only during inference, not during training.

Hi there,

_, outputs = self.model(batch) is not inside the context manager in your snippet.
Indent that line so it falls within the torch.no_grad() context manager.

Since you asked about other factors: one is the batch size. In your case this might not be the fix, but a larger batch size means computing and keeping larger intermediate activations (and gradients, if they are enabled) on the device, which eventually eats up device memory. This assumes you are using a pretrained model and are not training on your device, only running inference.
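If you want to see the effect of the batch size yourself, a minimal sketch like the following (assuming a plain torchvision densenet161 and random inputs) compares the peak allocated memory for two batch sizes during pure inference:

import torch
import torchvision

device = torch.device("cuda")
model = torchvision.models.densenet161().to(device).eval()

for batch_size in (32, 128):
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats(device)
    x = torch.randn(batch_size, 3, 224, 224, device=device)
    with torch.no_grad():
        _ = model(x)
    peak_mb = torch.cuda.max_memory_allocated(device) / 1024**2
    print(f"batch_size={batch_size}: peak allocated {peak_mb:.0f} MB")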

Thanks

Hey!
Yes, there is a tab; it somehow got rearranged while copying the code. Inference is being done inside the torch.no_grad() context manager.

I am pretty sure gradients are not computed, but GPU memory steadily increases during consecutive forward passes.
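A quick way to double-check that (a sketch reusing the names from the predict snippet above) is to verify inside the loop that grad mode is off and the outputs carry no graph:

with torch.no_grad():
    _, outputs = self.model(batch)
    # Both should print False if no autograd graph is being built for this pass.
    print(torch.is_grad_enabled())
    print(outputs.requires_grad)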
Thanks.

Hey, any update on this? I am facing the same issue, even with the torch.no_grad() guard, on the latest version, 1.9.

Could you post a minimal, executable code snippet that shows the increase in memory in PyTorch 1.9.0 using the no_grad guard, please?


Hi @ptrblck, thanks for your response. I figured out that the issue is not related to the forward pass; rather, PyTorch's caching allocator was holding on to the memory, which is why I saw an increased memory footprint in nvidia-smi. I have figured out the solution. Thanks once again.
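For reference, the difference between memory actually allocated by live tensors and memory held by PyTorch's caching allocator (which is what nvidia-smi reports) can be seen with a small sketch like this:

import torch

device = torch.device("cuda")
x = torch.randn(4096, 4096, device=device)  # ~64 MB of float32

print(torch.cuda.memory_allocated(device))  # memory used by live tensors
print(torch.cuda.memory_reserved(device))   # memory held by the caching allocator

del x
print(torch.cuda.memory_allocated(device))  # drops once the tensor is gone
print(torch.cuda.memory_reserved(device))   # stays reserved; nvidia-smi still shows it

torch.cuda.empty_cache()                    # hand cached blocks back to the driver
print(torch.cuda.memory_reserved(device))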