GPU running out of memory

I am trying to run a CNN model on the GPU with an input shape of (3, 224, 224). It raises the following issue. Here is the nvidia-smi output. How can I free up the GPU memory? Thank you.
(nvidia-smi output screenshot)

Error Msg:
data. DefaultCPUAllocator: not enough memory: you tried to allocate 34798181769216 bytes. Buy new RAM!

The error message points to your system RAM, not the GPU memory.
It seems you are trying to create a huge tensor on the CPU.
Could you post the line of code that raises this issue?

One simple solution is to lower the batch size until everything fits.
As @ptrblck said, it's a CPU allocation issue, so first make sure you are actually using the GPU by calling .cuda() (or .to(device)) on your model and your input tensors.
If you still get an error while using the GPU, try freeing cached GPU memory with torch.cuda.empty_cache() after every epoch or batch iteration; a sketch of this is shown below.
Otherwise I'd recommend gradient accumulation (more about it here); with it you can train your model with a larger effective batch size even if your GPU doesn't have that much memory.
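As a minimal sketch (assuming generic model, optimizer, loss_function, epochs and training_set names, since we haven't seen your code), moving everything to the GPU and clearing the cache once per epoch could look like this:

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)                                   # move the model's parameters to the GPU

for epoch in range(epochs):
    for inputs, labels in training_set:
        inputs, labels = inputs.to(device), labels.to(device)  # move each batch to the GPU
        optimizer.zero_grad()
        loss = loss_function(model(inputs), labels)
        loss.backward()
        optimizer.step()
    torch.cuda.empty_cache()                               # release cached, unused GPU memory back to the driver

Note that empty_cache() only frees memory that PyTorch has cached but is not using; it won't help if your tensors themselves are too large.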

If the number is calculated correctly, this would result in roughly 35 TB, which seems to be quite high, and I guess the code might have some bug/typo somewhere. :wink:
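(A quick back-of-the-envelope check of that figure:)

requested_bytes = 34798181769216
requested_bytes / 1024**4    # ≈ 31.65 TiB, i.e. roughly 35 TB requested in a single CPU allocation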

I just used the .to(device) method to move the tensors to the GPU.

Now I got this error:
RuntimeError: CUDA out of memory. Tried to allocate 588.00 MiB (GPU 0; 4.00 GiB total capacity; 2.95 GiB already allocated; 150.76 MiB free; 2.97 GiB reserved in total by PyTorch)

and optimizer.step() generates the error.

What is your batch size? I think it's too high for your GPU memory. As I said, use gradient accumulation to train your model.
If you want to train with a batch size of desired_batch_size, divide it by a reasonable number like 4, 8, or 16; this number is known as accumulation_steps. Now change the batch size for your dataset to desired_batch_size/accumulation_steps and train your model as below:

for epoch in range(epochs):
    for i, (inputs, labels) in enumerate(training_set):
        inputs, labels = inputs.to(device), labels.to(device)  # Move the batch to the GPU
        predictions = model(inputs)                     # Forward pass
        loss = loss_function(predictions, labels)       # Compute the loss
        loss = loss / accumulation_steps                # Normalize the loss (if averaged)
        loss.backward()                                 # Backward pass (gradients accumulate)
        if (i + 1) % accumulation_steps == 0:           # Wait for several backward steps
            optimizer.step()                            # Now we can do an optimizer step
            model.zero_grad()                           # Reset gradient tensors
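
For example (hypothetical numbers, not taken from your setup): to get an effective batch size of 64 on a 4 GB card, you could set accumulation_steps = 4 and build the DataLoader with the smaller per-step batch size, assuming you have a dataset object:

from torch.utils.data import DataLoader

desired_batch_size = 64            # effective batch size you want to train with
accumulation_steps = 4             # number of backward passes per optimizer step
training_set = DataLoader(dataset,
                          batch_size=desired_batch_size // accumulation_steps,  # 16 samples per forward pass
                          shuffle=True)

Each forward/backward pass then only needs memory for 16 samples, while the optimizer still effectively updates with gradients averaged over 64.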