I am trying to run the AllenNLP ELMo model, but it fails with a CUDA out-of-memory error. However, there is no other PyTorch session running, and I have also tried:

```python
import torch
torch.cuda.empty_cache()
```

and

```python
import gc
gc.collect()
```
but the error message is the same. How can I fix this? It seems to be a common issue with no clear solution. Thanks!
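Putting the two attempts together, the cleanup I ran looks roughly like this (guarded so it also runs on a machine without a GPU):

```python
import gc

import torch


def try_free_memory():
    """Attempt to release memory before re-running the model."""
    gc.collect()                  # drop unreachable Python objects first
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # return PyTorch's cached blocks to the driver


try_free_memory()
```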
It looks like you are trying to allocate more memory when running the model than your GPU has available.
I guess the model you are running is based on PyTorch? If so, you might want to reduce the batch size it uses.
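As an illustration (the dataset and sizes below are made up; the only point is where `batch_size` is set):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy dataset standing in for the real one.
data = TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,)))

# A smaller batch_size means smaller activations per forward pass,
# and therefore less GPU memory per step.
loader = DataLoader(data, batch_size=8)

for inputs, labels in loader:
    pass  # model forward/backward would go here
```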
The model is based on PyTorch, and I set my batch size to 2 to check whether that was the issue; I get the exact same error. I don’t understand how PyTorch reserves 988 MiB of memory. Is it allocated on the fly?
Well, this happens in the middle of the forward pass, I guess? So some data has already been allocated by the preceding code.
But given that you are trying to allocate 5.75 GB for a single Tensor and you have 6 GB of memory, that allocation is very unlikely to succeed if anything else at all is on the GPU alongside that Tensor.
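To make the arithmetic concrete, here is a small sketch; the shape below is invented purely so the total comes out to 5.75 GiB, it is not the model's real shape:

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Bytes needed for a dense tensor of the given shape (float32 by default)."""
    n = 1
    for dim in shape:
        n *= dim
    return n * bytes_per_element


# Hypothetical activation shape chosen so the total is exactly 5.75 GiB.
size_gib = tensor_bytes((32, 512, 94208)) / 2**30
```

At float32, 5.75 GiB is about 1.5 billion elements; on a 6 GB card that leaves well under 256 MiB for the model weights, the CUDA context, and everything else, which is why shrinking the batch size alone may not be enough here.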