I have a simple linear network class in PyTorch with 6000 hidden units.
When I call the following lines, it occupies 893 MB of GPU memory.
device = torch.device("cuda")
model = DeepNN().to(device)
where DeepNN is the class name. This initial memory usage grows as I increase the number of hidden units.
I am wondering what this means and whether there is any way to fix it.
Update: To give some more information: with a network using nn.Linear(6000, 6000), I expect the initial memory to be 144 MB, but it is currently 600 MB. When I increase the layer size to nn.Linear(30000, 30000), I expect the memory to be around 3.5 GB, but it occupies 5 GB.
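For reference, the expected parameter memory can be worked out by hand (a quick sketch, assuming float32 parameters: nn.Linear(n, n) stores an n×n weight matrix plus an n-element bias, at 4 bytes per element):

```python
def linear_param_bytes(in_features, out_features, dtype_bytes=4):
    """Bytes needed for an nn.Linear layer's parameters:
    weight (out_features x in_features) plus bias (out_features)."""
    n_params = out_features * in_features + out_features
    return n_params * dtype_bytes

print(linear_param_bytes(6000, 6000) / 1e6)    # 144.024 -> ~144 MB
print(linear_param_bytes(30000, 30000) / 1e9)  # 3.60012 -> ~3.6 GB
```

So the 144 MB and ~3.5 GB estimates above match the raw parameter sizes; anything beyond that comes from other sources.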
I understand. My question is why the memory is occupied before the actual computations run. Is that normal?
Even with a one-layer network, this amount of memory stays occupied.
Thanks for the reply.
What I see is that the pre-occupied memory keeps growing as I increase the number of hidden units. With 20000 hidden units, it occupies around 3 GB.
Is there any way to avoid this? I am basically losing memory that I need for the actual computation.
The parameters have to be stored on the GPU so that the computation itself can be performed on the device.
As @JuanFMontesinos said, the CUDA context will use some memory besides that.
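One way to separate the two contributions is to compare what PyTorch itself has allocated for tensors against the total that nvidia-smi reports: torch.cuda.memory_allocated() counts only tensor storage, while the nvidia-smi figure also includes the CUDA context (driver state, compiled kernels, cuBLAS/cuDNN workspaces), which varies with the CUDA/PyTorch version and the GPU. A rough sketch, using a bare nn.Linear as a stand-in for the DeepNN model from the question:

```python
import torch
import torch.nn as nn

# Stand-in for the DeepNN model from the question.
model = nn.Linear(6000, 6000)

# Parameter memory computed from the model itself (device-independent).
param_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
print(f"parameters: {param_bytes / 1e6:.1f} MB")  # ~144 MB

if torch.cuda.is_available():
    model = model.to("cuda")
    torch.cuda.synchronize()
    # Memory PyTorch has allocated for tensors -- should match param_bytes.
    print(f"allocated: {torch.cuda.memory_allocated() / 1e6:.1f} MB")
    # Memory held by PyTorch's caching allocator (>= allocated).
    print(f"reserved:  {torch.cuda.memory_reserved() / 1e6:.1f} MB")
    # Whatever nvidia-smi shows beyond "reserved" is the CUDA context itself.
```

If "allocated" matches your hand calculation but nvidia-smi shows much more, the gap is context overhead rather than your parameters.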
If it is CUDA context memory, why is it related to the number of hidden units? And if it is for the parameters, the numbers I am getting are not what the calculations say. I edited my question with some examples.