GPU memory that a model uses

How can we simply calculate the GPU memory a model (nn.Module) uses? Just on a single GPU.


To calculate the memory requirement for all parameters and buffers, you could sum each tensor's number of elements multiplied by its element size:

# Sum the byte size (element count * bytes per element) of all parameters and buffers
mem_params = sum(param.nelement() * param.element_size() for param in model.parameters())
mem_bufs = sum(buf.nelement() * buf.element_size() for buf in model.buffers())
mem = mem_params + mem_bufs  # total size in bytes

However, this will not include the peak memory usage for the forward and backward pass (if that’s what you are looking for).
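As a quick sanity check, here is a minimal sketch that converts the result to megabytes (the two-layer model is made up, just for illustration):

import torch
import torch.nn as nn

# Hypothetical toy model, only for illustration
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10))

mem_params = sum(p.nelement() * p.element_size() for p in model.parameters())
mem_bufs = sum(b.nelement() * b.element_size() for b in model.buffers())
print(f"{(mem_params + mem_bufs) / 1024**2:.2f} MB")  # ~4 MB for float32 weights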


What’s the peak memory usage?

During training, intermediate tensors are kept alive because they are needed to backpropagate and calculate the gradients. These intermediate tensors will be freed once the gradients have been calculated (assuming you haven't used retain_graph=True), so you'll see more memory usage during training than the model parameters and buffers alone would use.
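You can see this directly with torch.cuda.memory_allocated(). A minimal sketch, assuming a CUDA device is available (the model and shapes are made up for illustration):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(), nn.Linear(4096, 10)).cuda()
x = torch.randn(256, 4096, device="cuda")

print("before forward:", torch.cuda.memory_allocated() / 1024**2, "MB")

out = model(x)  # activations are kept alive for the backward pass
print("after forward: ", torch.cuda.memory_allocated() / 1024**2, "MB")

out.sum().backward()  # intermediates are freed once gradients are computed
print("after backward:", torch.cuda.memory_allocated() / 1024**2, "MB")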


Is peak memory usage equivalent to the forward/backward pass size here?

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 10, 24, 24]             260
            Conv2d-2             [-1, 20, 8, 8]           5,020
         Dropout2d-3             [-1, 20, 8, 8]               0
            Linear-4                   [-1, 50]          16,050
            Linear-5                   [-1, 10]             510
================================================================
Total params: 21,840
Trainable params: 21,840
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.06
Params size (MB): 0.08
Estimated Total Size (MB): 0.15
----------------------------------------------------------------

It might be, but I’m not sure which utility you are using and how it estimates the memory usage.

@ptrblck, how can we measure the peak memory usage? This seems like the most important question for avoiding CUDA out of memory errors. Should I ask a separate question for this?


torch.cuda.max_memory_allocated() should give you the max value. I'm not sure if the logging utility you are currently using reports a matching number, but it would be interesting to see.
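A minimal sketch of how this could be used around a single training step (the model and input are made up; torch.cuda.reset_peak_memory_stats() clears the running maximum so the reading reflects only the work done afterwards):

import torch
import torch.nn as nn

model = nn.Linear(4096, 4096).cuda()
x = torch.randn(256, 4096, device="cuda")

torch.cuda.reset_peak_memory_stats()  # clear the running peak

model(x).sum().backward()  # one forward/backward step

print(f"peak: {torch.cuda.max_memory_allocated() / 1024**2:.2f} MB")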


Thanks, that was it:

import torch
torch.cuda.max_memory_allocated()  # peak allocated memory in bytes

Hopefully this can help me figure out the max batch size I can use for a model. But I wonder if something like this is already built into PyTorch.

However, I am not sure whether this will also count memory that is held by the garbage collector and could be freed after gc.collect().

Maybe this is what's called the cache.
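For what it's worth, PyTorch's caching allocator distinguishes between allocated memory (live tensors) and reserved memory (the cache). A small sketch, assuming a CUDA device, to inspect both:

import torch

x = torch.randn(1024, 1024, device="cuda")
print(torch.cuda.memory_allocated() / 1024**2, "MB allocated")  # live tensors
print(torch.cuda.memory_reserved() / 1024**2, "MB reserved")    # held by the caching allocator

del x  # the memory returns to PyTorch's cache, not to the driver
print(torch.cuda.memory_allocated() / 1024**2, "MB allocated")
print(torch.cuda.memory_reserved() / 1024**2, "MB reserved")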

These intermediate tensors will be freed once the gradients have been calculated (assuming you haven't used retain_graph=True)

Could you provide a link to the exact lines in the source? I need to investigate this part. Thanks a lot for helping people here!
