Hello everyone. I am trying to get CUDA working with the OpenAI Whisper release. My current setup works just fine on the CPU, using the medium.en model.
I have installed CUDA-enabled PyTorch on a Windows 10 machine, but when I try speech-to-text decoding with CUDA enabled it fails with an out-of-memory error:
RuntimeError: CUDA out of memory. Tried to allocate 70.00 MiB (GPU 0; 4.00 GiB total capacity; 2.87 GiB already allocated; 0 bytes free; 2.88 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
I disabled and re-enabled the graphics card before running the code, so its VRAM was completely empty. I tested on my laptop, which has a second GPU as well, so my GTX 1050 was literally using 0 MB of memory to start with.
My computer also has 32 GB of RAM, and CPU decoding works well but is just too slow: in 7 hours it processed only 1 hour of speech.
My installed version:
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: pytorch.org
Author: PyTorch Team
Author-email: packages@pytorch.org
Required-by: whisper, torchvision, torchaudio
Do you have any suggestions to solve this problem?
The problem could be the GPU memory used by loading all the kernels PyTorch ships with, which can take a good chunk of memory. You can check that by loading PyTorch, generating a small CUDA tensor, and then comparing how much memory the process actually uses against how much PyTorch says it has allocated.
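A minimal sketch of that check (the function name is mine; it returns None when no CUDA device is present, so it can run harmlessly on a CPU-only machine):

```python
import torch

def cuda_context_overhead_mb():
    """Allocate one tiny CUDA tensor and return (allocated_mb, reserved_mb)
    as seen by PyTorch, or None if no CUDA device is available.
    Compare these numbers with what nvidia-smi reports for the process --
    the gap is roughly the CUDA context plus PyTorch's kernel binaries."""
    if not torch.cuda.is_available():
        return None
    x = torch.ones(1, device="cuda")  # forces context creation + kernel load
    torch.cuda.synchronize()
    allocated = torch.cuda.memory_allocated() / 2**20
    reserved = torch.cuda.memory_reserved() / 2**20
    return allocated, reserved
```

If nvidia-smi shows the process holding far more than `reserved`, that extra chunk is the fixed overhead the allocator can't give back to your model.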
There has been work to put PyTorch on a bit of a diet there (e.g. the “JITerator”), but I’m not sure about the current state, in particular on Windows.
Fun fact: In the olden times, PyTorch would print “Buy more RAM” along with the error message, but then things got all serious.
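As a practical workaround on a 4 GB card, you can also try the allocator hint from the error message together with a smaller model in fp16. A sketch, assuming the standard openai-whisper Python API (`audio.mp3` is a placeholder path, and the function wrapper is mine):

```python
import os

# Allocator hint from the error message; must be set before the process
# makes its first CUDA allocation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

def transcribe_small(path):
    """Transcribe with the smaller small.en model in fp16, which needs
    considerably less VRAM than medium.en (standard openai-whisper API)."""
    import whisper
    model = whisper.load_model("small.en", device="cuda")
    return model.transcribe(path, fp16=True)["text"]

# transcribe_small("audio.mp3")  # run on the machine with the GPU attached
```

Dropping from medium.en to small.en costs some accuracy, but it is often the difference between fitting and not fitting on a 4 GB GPU.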