OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB

OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 79.35 GiB total capacity; 32.78 GiB already allocated; 19.19 MiB free; 33.10 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I am trying to fine-tune the Whisper model, and it's throwing this error.

I tried with a small dataset. It works in Colab Pro, but it is not working on a DGX A100 server.

I need help with this; I am new to this.

I tried setting os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb=100",
but there was no change in the error.
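One thing worth checking (an assumption on my part, since your script isn't shown): `PYTORCH_CUDA_ALLOC_CONF` is read when PyTorch's CUDA caching allocator initializes, so setting it after `torch` has already touched the GPU may silently have no effect. A minimal sketch of the safe ordering:

```python
import os

# Set the allocator config BEFORE torch initializes CUDA in this process;
# if CUDA tensors were already created, the setting may be ignored.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb=100"

# import torch  # only import torch (and build the model) after the variable is set
```

Alternatively, export the variable in the shell before launching the script (`PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb=100 python train.py`), which avoids the ordering question entirely.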

Based on the error message, it seems another process is using the GPU memory: PyTorch has only ~33 GiB of the 79.35 GiB reserved, yet just 19 MiB is reported free, so roughly 46 GiB is held outside PyTorch and it cannot allocate more. Check if something else is being executed on the device, or if e.g. a dead process might still be holding the memory.
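A quick way to confirm this from inside Python is to compare the device-wide free/total numbers (which count all processes) against PyTorch's own allocator counters. This is a hedged sketch; `report_cuda_memory` is a hypothetical helper name, and it degrades gracefully when no GPU is visible:

```python
import importlib.util

def report_cuda_memory(device: int = 0):
    """Return a dict of memory stats for one GPU, or None if CUDA is unusable."""
    if importlib.util.find_spec("torch") is None:
        return None  # PyTorch not installed in this environment
    import torch
    if not torch.cuda.is_available():
        return None  # no visible CUDA device
    # mem_get_info reports free/total for the whole device, across ALL processes
    free, total = torch.cuda.mem_get_info(device)
    return {
        "total_gib": total / 2**30,
        "free_gib": free / 2**30,
        # these two count only THIS process's PyTorch allocator
        "torch_allocated_gib": torch.cuda.memory_allocated(device) / 2**30,
        "torch_reserved_gib": torch.cuda.memory_reserved(device) / 2**30,
    }

print(report_cuda_memory())
```

If `total_gib - free_gib` is far larger than `torch_reserved_gib`, the remainder belongs to other processes; `nvidia-smi` will show their PIDs so you can kill any stale ones.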