I am getting the following error message in VS Code when I run a Jupyter notebook with Python 3.11.3 as the Jupyter kernel:
"name": "OutOfMemoryError",
"message": "CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 8.00 GiB total capacity; 7.00 GiB already allocated; 0 bytes free; 7.13 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF"
I then ran the following tests in sequence, disabling one GPU at a time:
Test 1.
NVIDIA GeForce RTX 2080 #1 disabled; #2, #3 and #4 enabled. Ran the segment/mask step in the ‘Segment Anything Model’ Jupyter notebook in VS Code. Run successful, no errors.
Test 2.
NVIDIA GeForce RTX 2080 #2 disabled; #1, #3 and #4 enabled. Ran the segment/mask step in the ‘Segment Anything Model’ Jupyter notebook in VS Code. Run successful, no errors.
Test 3.
NVIDIA GeForce RTX 2080 #3 disabled; #1, #2 and #4 enabled. Ran the segment/mask step in the ‘Segment Anything Model’ Jupyter notebook in VS Code. Run unsuccessful, CUDA out of memory error (‘Test 3 error’ attached).
Test 4.
NVIDIA GeForce RTX 2080 #4 disabled; #1, #2 and #3 enabled. Ran the segment/mask step in the ‘Segment Anything Model’ Jupyter notebook in VS Code. Run unsuccessful, unknown error (‘Test 4, 5 and 6 error’ attached).
Test 5.
NVIDIA GeForce RTX 2080 #1 disabled; #2, #3 and #4 enabled. Ran the segment/mask step in the ‘Segment Anything Model’ Jupyter notebook in VS Code. Run unsuccessful, unknown error (‘Test 4, 5 and 6 error’ attached).
Test 6.
NVIDIA GeForce RTX 2080 #2 disabled; #1, #3 and #4 enabled. Ran the segment/mask step in the ‘Segment Anything Model’ Jupyter notebook in VS Code. Run unsuccessful, unknown error (‘Test 4, 5 and 6 error’ attached).
To me, these results don’t point to a fault in a specific GPU: Tests 5 and 6 repeat the configurations of Tests 1 and 2, which succeeded the first time and failed on the rerun. Instead, the problem seems to get worse as the number of runs increases. Perhaps it is a memory cache issue, although I have already added a line to clear the cache and the error persists. Has anyone come across this before?
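For reference, here is a minimal sketch of the kind of cache clearing and per-GPU memory check described above, assuming PyTorch (which the error message indicates). It is not the exact notebook code: the helper name free_cuda_memory is hypothetical, and the max_split_size_mb value of 128 is an arbitrary assumption for the setting the error message suggests.

```python
import gc
import os

# Assumption: 128 MiB is an arbitrary illustrative value. The variable must be
# set before PyTorch makes its first CUDA allocation to have any effect.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")

try:
    import torch  # PyTorch may not be installed in every environment
except ImportError:
    torch = None


def free_cuda_memory():
    """Hypothetical helper: clear the cache and return per-GPU memory stats in MiB."""
    gc.collect()  # drop unreachable Python objects that may still hold tensors
    if torch is None or not torch.cuda.is_available():
        return []  # nothing to report without a usable CUDA device
    torch.cuda.empty_cache()  # hand cached but unused blocks back to the driver
    stats = []
    for i in range(torch.cuda.device_count()):
        stats.append({
            "device": i,
            "name": torch.cuda.get_device_name(i),
            "allocated_mib": torch.cuda.memory_allocated(i) / 2**20,
            "reserved_mib": torch.cuda.memory_reserved(i) / 2**20,
        })
    return stats


# Print one line per visible GPU, e.g. after each segment/mask run
for entry in free_cuda_memory():
    print(entry)
```

Calling this after each run would show whether allocated memory keeps growing between runs. Note that empty_cache() only releases memory that no live tensor is using, so anything still referenced from a notebook variable (the model, masks, or intermediate outputs) stays allocated until it is deleted or the kernel is restarted.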
Thanks,
Karl