RuntimeError: CUDA out of memory. Tried to allocate 5.75 GiB (GPU 0; 6.00 GiB total capacity; 881.81 MiB already allocated; 3.67 GiB free; 988.00 MiB reserved in total by PyTorch)

I am trying to run the AllenNLP ELMo model, but it doesn’t work due to lack of GPU memory. However, there is no other PyTorch session running, and I have also tried doing:
import torch
torch.cuda.empty_cache()
and
import gc
gc.collect()

but the error message stays the same. How can I fix this? It looks like a common issue with no clear solution. Thanks!


Hi,

It looks like you are trying to allocate more memory when running the model than your GPU has.
I guess the model you run is based on PyTorch? If so, you might want to reduce the batch size used there.
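As a rough illustration of the suggestion above (nothing AllenNLP-specific; `iter_chunks` is a hypothetical helper), splitting a large input into smaller chunks means only one chunk's activations need to fit on the GPU at a time:

```python
# Hypothetical sketch: process a large input in smaller chunks so that only
# one chunk's worth of activations lives on the GPU at any moment.
def iter_chunks(batch, chunk_size):
    for i in range(0, len(batch), chunk_size):
        yield batch[i:i + chunk_size]

batch = list(range(10))
chunks = list(iter_chunks(batch, 4))
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Each chunk would then be moved to the GPU, run through the model, and its results moved back to the CPU before the next chunk is processed.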

The model is based on PyTorch, and I reduced my batch size to 2 to check if that was the issue; I get the exact same error. I also don’t understand how PyTorch reserves 988 MiB of memory. Is it allocated on the fly?

Well, this happens in the middle of the forward pass, I guess? So some data has already been allocated by previous code.
But given that you are trying to allocate 5.75 GiB for a single Tensor and you have 6 GiB of memory, this looks very unlikely to succeed if you have anything at all on the GPU besides that Tensor :confused:
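For scale, a quick back-of-the-envelope calculation shows how fast float32 activations add up (the shape below is made up purely to reproduce the 5.75 GiB figure from the error message):

```python
# Rough estimate of a float32 tensor's memory footprint: numel * 4 bytes.
def tensor_gib(shape, bytes_per_element=4):  # float32 = 4 bytes per element
    n = 1
    for dim in shape:
        n *= dim
    return n * bytes_per_element / 2**30  # convert bytes to GiB

# A single hypothetical (64, 512, 47104) float32 tensor already needs
# the full 5.75 GiB allocation from the error message:
print(tensor_gib((64, 512, 47104)))  # 5.75
```

This is why shrinking the batch dimension alone may not help if some other dimension (e.g. sequence length) is what makes the tensor huge.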


The reason was that the input batch was too big for the GPU to accommodate.
