I am trying to run the AllenNLP ELMo model, but it fails with a CUDA out-of-memory error. However, there is no other PyTorch session running, and I have also tried:

```python
import torch
torch.cuda.empty_cache()
```

and

```python
import gc
gc.collect()
```
but the error message is the same. How can I fix this? It seems to be a common issue with no clear solution. Thanks!
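Putting the two attempts together, the cleanup I ran looks roughly like this (guarded so it also runs on a machine without a GPU):

```python
import gc

import torch


def try_free_memory():
    """Attempt to release memory before re-running the model."""
    gc.collect()                  # drop unreachable Python objects first
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # return PyTorch's cached blocks to the driver


try_free_memory()
```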
It looks like you are trying to allocate more memory when running the model than your GPU has available.
I guess the model you are running is based on PyTorch? If so, you might want to reduce the batch size it uses.
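As an illustration (the dataset and sizes below are made up; the only point is where `batch_size` is set):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy dataset standing in for the real one.
data = TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,)))

# A smaller batch_size means smaller activations per forward pass,
# and therefore less GPU memory per step.
loader = DataLoader(data, batch_size=8)

for inputs, labels in loader:
    pass  # model forward/backward would go here
```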
The model is based on PyTorch, and I set my batch size to 2 to check whether that was the issue; I get the exact same error. I don’t understand how PyTorch reserves 988 MiB of memory. Is it allocated on the fly?
Well, this happens in the middle of the forward pass, I guess? So some data has already been allocated by the preceding code.
But given that you are trying to allocate 5.75 GB for a single Tensor and you have 6 GB of memory, that allocation is very unlikely to succeed if anything else at all is on the GPU alongside that Tensor.
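To make the arithmetic concrete, here is a small sketch; the shape below is invented purely so the total comes out to 5.75 GiB, it is not the model's real shape:

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Bytes needed for a dense tensor of the given shape (float32 by default)."""
    n = 1
    for dim in shape:
        n *= dim
    return n * bytes_per_element


# Hypothetical activation shape chosen so the total is exactly 5.75 GiB.
size_gib = tensor_bytes((32, 512, 94208)) / 2**30
```

At float32, 5.75 GiB is about 1.5 billion elements; on a 6 GB card that leaves well under 256 MiB for the model weights, the CUDA context, and everything else, which is why shrinking the batch size alone may not be enough here.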