RAM gets filled even though everything is shifted to GPU

Hello all,
I have a model and a dataset class. On a fresh boot, the system uses 1.8 GB of RAM with no processes running. On initialization, the dataset class takes up an additional 2 to 2.5 GB, as it stores some variables for later reference. I instantiate the model class and pass it to the training function, which looks as follows:

import torch
import torch.nn as nn

def trainer(model, train_dataloader, val_dataloader, num_epochs):
    torch.backends.cudnn.benchmark = True  # let cuDNN auto-tune its convolution algorithms
    model.cuda()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=0.00009)
    criterion = nn.CrossEntropyLoss().cuda()

    for epoch in range(num_epochs):
        model.train()
        epoch_loss_train = 0
        epoch_acc_train = 0

        for image, label in train_dataloader:
            optimizer.zero_grad()
            # Move the batch to the GPU
            image = image.cuda()
            label = label.cuda()
            output = model(image)

            loss = criterion(output, label)
            loss.backward()
            optimizer.step()

            del image, label

As seen above, the model is moved to the GPU, and the image and label returned by the dataloader are moved to the GPU as well. The dataloader runs on a single thread, and the system monitor confirms that.
Once the training loop starts, around 2.8 GB of GPU memory is used. However, RAM usage climbs to 7 GB. Where is this additional 2.5 GB coming from?

Hi,

Can you check the RAM usage after running just a simple CUDA op such as torch.rand(10, device="cuda")? The CPU-side memory usage of the CUDA driver is known to be very large.
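If it helps, here is a rough sketch of how that check could be scripted instead of read off the system monitor. The helper names (peak_rss_mb, rss_growth_mb, main) are made up for this example. It relies on the Unix-only resource module and reports the growth in peak resident memory, so take the number as an estimate rather than an exact figure.

```python
import resource
import sys


def peak_rss_mb():
    """Peak resident set size of this process in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    if sys.platform == "darwin":
        return peak / (1024 * 1024)
    return peak / 1024


def rss_growth_mb(fn):
    """Call fn() and return how much the peak RSS grew, in MiB."""
    before = peak_rss_mb()
    result = fn()  # hold a reference so the result stays alive while we measure
    after = peak_rss_mb()
    del result
    return after - before


def main():
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed; nothing to measure")
        return
    if not torch.cuda.is_available():
        print("CUDA is not available; nothing to measure")
        return

    # torch is already imported at this point, so the growth measured
    # below should come mostly from CUDA initialisation, not the import.
    def first_cuda_op():
        x = torch.rand(10, device="cuda")
        torch.cuda.synchronize()  # wait until the op has actually run
        return x

    growth = rss_growth_mb(first_cuda_op)
    print(f"Host RAM growth from the first CUDA op: {growth:.0f} MiB")


if __name__ == "__main__":
    main()
```

Run it in a fresh Python process, since only the first CUDA call in a process pays the initialisation cost.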

Damn, you are right: my RAM usage shot up from 3.1 GB to 5.2 GB. So that is where the extra 2.1 GB comes from.
