Why does the torch.add function increase GPU memory usage?

Hi,
For the following function,
the shapes of l and r are both (1, 64, 640, 360), both tensors are on 'cuda:0', and both have dtype torch.float32.

The problem is that once the function is executed, the GPU memory in use increases by 58 MB.
Why does torch.add increase GPU memory usage?

I tested it many times, and the memory grew by 58 MB every time.

Even if I use l + r instead of torch.add(l, r), the memory still increases by 58 MB. :sweat_smile:

def _add(self, *inputs):
    l, r = inputs
    return torch.add(l, r)
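
For reference, one way to confirm where the memory goes is to check torch.cuda.memory_allocated() around the call (a minimal sketch; the tensor contents are made up and a CUDA device is assumed):

import torch

l = torch.randn(1, 64, 640, 360, device="cuda:0")
r = torch.randn(1, 64, 640, 360, device="cuda:0")

before = torch.cuda.memory_allocated("cuda:0")
out = torch.add(l, r)  # out-of-place: a new output tensor is allocated
after = torch.cuda.memory_allocated("cuda:0")
print(f"extra memory: {(after - before) / 1024**2:.1f} MiB")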

Hi,
maybe it's because the inputs are retained for computing gradients. If you don't need to call backward, you can use:

@torch.no_grad()
def _add(self, *inputs):
    l, r = inputs
    return torch.add(l, r)

I'm not sure about it, though.
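
(My understanding, not from the original post: the decorator only wraps the call in a no-grad context, so no autograd graph or saved tensors are recorded for the result. Using the same example tensors as above:)

with torch.no_grad():
    out = torch.add(l, r)
print(out.requires_grad)  # False: no autograd graph is recorded under no_grad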

I tried @torch.no_grad(), but that is not the reason.
I am only testing the forward pass of the function, not backward.

Then maybe you can use an in-place operation:

l.add_(r)
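
For example, the helper above rewritten with the in-place add (just a sketch):

def _add(self, *inputs):
    l, r = inputs
    return l.add_(r)  # writes the result into l, so no new output tensor is allocated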

Nice!!!
The in-place operation works: GPU memory no longer increases during the forward pass.
However, during backward I ran into another issue:

Exception has occurred: RuntimeError one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 64, 640, 360]], which is output 0 of AsStridedBackward, is at version 9; expected version 8 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

You can't call backward when an in-place operation has overwritten a tensor that autograd saved for computing gradients.
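
A minimal, made-up example of that failure mode, to show why autograd complains:

import torch

x = torch.randn(4, requires_grad=True)
y = torch.exp(x)    # exp saves its output for the backward pass
y.add_(1)           # the in-place update bumps y's version counter
y.sum().backward()  # RuntimeError: a variable needed for gradient computation was modified by an inplace operation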

I am still wondering why torch.add increases memory usage by 58 MB.
What about other operations, such as torch.divide, torch.mul, etc.?
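
My guess at the arithmetic (not from the thread): any out-of-place op like torch.add, torch.mul, or torch.divide allocates a fresh output tensor the same size as its inputs, and for this shape that is roughly the 58 MB increase observed:

import torch

out = torch.empty(1, 64, 640, 360, dtype=torch.float32)
print(out.numel() * out.element_size() / 1024**2)  # ~56.25 MiB (~59 MB) per new output tensor

The in-place variants (add_, mul_, div_) write into an existing tensor instead, which matches why l.add_(r) above did not grow memory.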