How to avoid memory fragmentation?

CUDA out of memory. Tried to allocate 6.85 GiB (GPU 0; 23.69 GiB total capacity; 9.79 GiB already allocated; 2.73 GiB free; 16.13 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

The message says that the reserved memory is significantly larger than the allocated memory, which points to fragmentation. What needs to be done to handle the fragmentation?

Can someone please suggest how to avoid this issue? I have already tried freeing the cache, and I have limited the splitting of allocator blocks with export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128. But it doesn’t help.
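
For reference, here is a minimal sketch of those two steps (with a toy tensor standing in for my real workload); as far as I understand, PYTORCH_CUDA_ALLOC_CONF is read when the caching allocator initializes, so it has to be set before the first CUDA allocation:

import os

# Set before the first CUDA tensor is created, otherwise the
# allocator has already initialized and the setting is ignored.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch

x = torch.empty(256, 1024, 1024, device="cuda")  # ~1 GiB fp32 stand-in tensor
del x                     # drop the last reference first...
torch.cuda.empty_cache()  # ...then release cached blocks back to the driver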

You may want to first get info on how GPU memory is allocated by using

torch.cuda.memory_summary(device=None, abbreviated=False)
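
For example, a minimal sketch that prints the summary together with the two raw counters behind it:

import torch

print(torch.cuda.memory_summary(device=None, abbreviated=False))

# A large gap between these two counters is the fragmentation
# the error message is pointing at: memory is reserved by the
# allocator but too fragmented to serve one big request.
allocated = torch.cuda.memory_allocated() / 2**30
reserved = torch.cuda.memory_reserved() / 2**30
print(f"allocated: {allocated:.2f} GiB, reserved: {reserved:.2f} GiB")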

You can also try reducing your model size, batch size…
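
If you still need the original effective batch size, one common workaround is to shrink the per-step batch and accumulate gradients. A minimal sketch, where the model, optimizer, and data are toy stand-ins for yours:

import torch
import torch.nn as nn

model = nn.Linear(512, 10).cuda()   # stand-in for the real model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
accum_steps = 4                     # 4 micro-batches of 8 ~= one batch of 32

optimizer.zero_grad()
for i in range(8):                  # stand-in for the real data loader
    inputs = torch.randn(8, 512, device="cuda")
    targets = torch.randint(0, 10, (8,), device="cuda")
    loss = loss_fn(model(inputs), targets)
    (loss / accum_steps).backward() # scale so grads match the large batch
    if (i + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()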