RuntimeError: CUDA out of memory during training

My model has 195465 trainable parameters, and my training loop runs fine with batch_size = 1. But as soon as I increase batch_size to even 2, CUDA runs out of memory.

I tried to check the status of my GPU using this block of code:

import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)
print()

# Additional info when using CUDA
if device.type == 'cuda':
    print(torch.cuda.get_device_name(0))
    print('Memory Usage:')
    print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
    # memory_cached() was renamed memory_reserved() in newer PyTorch releases
    print('Cached:   ', round(torch.cuda.memory_cached(0)/1024**3,1), 'GB')

which gave the following output:

Using device: cuda

GeForce GTX 1050
Memory Usage:
Allocated: 2.9 GB
Cached:    2.9 GB

Can anyone help me understand the above output and why this is happening?

Hey bolt25,

Did you try torch.cuda.empty_cache()?

Yes, I did, but it didn't help.

During the forward pass, intermediate tensors (the output activations) are stored, since they are needed for the gradient calculation during the backward pass.
Depending on the model architecture, these activations might use more memory than the model parameters.
You could check the peak memory via torch.cuda.max_memory_allocated() using a batch size of 1.
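
As a rough illustration: assuming float32 weights, 195465 parameters take about 195465 * 4 bytes, i.e. roughly 0.75 MB, so the parameters alone cannot explain 2.9 GB of allocated memory. The sketch below shows one way to compare parameter memory against the peak memory of a single forward/backward pass. The small convolutional model, the input shape, and the helper name peak_memory_mib are placeholders I made up for the example, not the model from this thread; swap in your own model and a real batch.

```python
import torch
import torch.nn as nn


def peak_memory_mib(model, batch, device):
    """Peak allocated CUDA memory (MiB) for one forward/backward pass."""
    torch.cuda.reset_peak_memory_stats(device)
    out = model(batch.to(device))
    out.sum().backward()  # dummy scalar loss, only used to trigger backward
    torch.cuda.synchronize(device)
    return torch.cuda.max_memory_allocated(device) / 1024**2


# Hypothetical stand-in model (NOT the model discussed in this thread).
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1),
)

# Memory taken by the trainable parameters, assuming float32 (4 bytes each).
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
param_mib = n_params * 4 / 1024**2
print(f"trainable parameters: {n_params} ({param_mib:.3f} MiB)")

if torch.cuda.is_available():
    device = torch.device("cuda")
    model.to(device)
    batch = torch.randn(1, 3, 256, 256)  # batch size of 1, assumed input shape
    print(f"peak allocated: {peak_memory_mib(model, batch, device):.1f} MiB")
```

If the peak for a batch size of 1 is already close to the total memory of the card, a batch size of 2 is unlikely to fit, because the activation memory grows roughly with the batch size.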

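Calling torch.cuda.empty_cache() will not avoid the OOM issue. It only releases unused cached memory that PyTorch could have reused anyway, so later allocations have to go back to the driver, which slows down your code.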
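
A small sketch of this behaviour, assuming a PyTorch version that provides torch.cuda.memory_reserved() (the newer name for memory_cached()). The helper cache_report and the tensor sizes are made up for the example. Memory held by live tensors should be unaffected by empty_cache(); only the reserved-but-unused cache can shrink.

```python
import torch


def cache_report(device):
    """Return (allocated, reserved) bytes on the given CUDA device."""
    return (
        torch.cuda.memory_allocated(device),
        torch.cuda.memory_reserved(device),
    )


if torch.cuda.is_available():
    device = torch.device("cuda")
    x = torch.randn(1024, 1024, device=device)    # live tensor, stays allocated
    tmp = torch.randn(4096, 4096, device=device)  # temporary tensor
    del tmp  # its memory goes back to PyTorch's cache, not to the driver

    before = cache_report(device)
    torch.cuda.empty_cache()  # hands unused cached blocks back to the driver
    after = cache_report(device)

    print("allocated before/after:", before[0], after[0])
    print("reserved  before/after:", before[1], after[1])
```

The memory a training step actually needs shows up in the allocated figure, so emptying the cache does not make a larger batch fit.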
