Trying to train, I get:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1002.00 MiB. GPU 0 has a total capacity of 15.69 GiB of which 788.88 MiB is free.
The recommended answer everywhere seems to be:
torch.cuda.empty_cache()
but all I get when I enter that command is a greater-than (>) cursor.
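Am I supposed to run that inside a Python session rather than straight at the shell prompt? (The > looks like the shell waiting for more input.) If so, I assume it would look roughly like this (plain PyTorch, nothing Oobabooga-specific):

import torch

# Release cached blocks held by PyTorch's caching allocator.
# As I understand it, this doesn't free memory still referenced by live
# tensors, and it can't touch memory held by a different process.
torch.cuda.empty_cache()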
Running Ubuntu.
Using Meta-Llama-3-8B-Instruct and Oobabooga to train.
16 GB GPU; nothing significant should be running.
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.107.02             Driver Version: 550.107.02     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 ...    Off |   00000000:01:00.0  On |                  N/A |
|  0%   35C    P8              7W /  285W |   15273MiB /  16376MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            2501      G   /usr/lib/xorg/Xorg                      195MiB |
|    0   N/A  N/A            2884    C+G   ...libexec/gnome-remote-desktop-daemon  211MiB |
|    0   N/A  N/A            2937      G   /usr/bin/gnome-shell                     26MiB |
|    0   N/A  N/A            3807      G   /opt/teamviewer/tv_bin/TeamViewer        16MiB |
|    0   N/A  N/A            4551      G   ...AAAAAAAACAAAAAAAAAA= --shared-files   25MiB |
|    0   N/A  N/A            4646      G   /usr/bin/nautilus                        27MiB |
|    0   N/A  N/A            5920      G   ...seed-version=20240913-050142.817000   85MiB |
|    0   N/A  N/A           14891      C   python                                14652MiB |
+-----------------------------------------------------------------------------------------+
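In case it helps with diagnosing, I was also going to check memory from inside Python with something like the following (just standard PyTorch calls, as far as I know):

import torch

# Free vs. total device memory in bytes, as reported by the CUDA driver
free_bytes, total_bytes = torch.cuda.mem_get_info()
print(f"free: {free_bytes / 2**20:.0f} MiB, total: {total_bytes / 2**20:.0f} MiB")

# Breakdown of what PyTorch's own caching allocator is holding
print(torch.cuda.memory_summary())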
Suggestions appreciated.