CUDA out of memory while running UniLM

I am trying to fine-tune UniLM (https://github.com/microsoft/unilm) on the Abstractive Summarization - [Gigaword] (10K) task, but while running the code I encountered this error:

RuntimeError: CUDA out of memory. Tried to allocate 98.00 MiB (GPU 0; 10.92 GiB total capacity; 10.01 GiB already allocated; 67.56 MiB free; 10.29 GiB reserved in total by PyTorch)

Can anyone explain what is going on and, even better, help me solve it?

Much Appreciated!

Your model and training setup might use too much device memory, which raises this error.
Try reducing the batch size to lower the memory footprint.
If that's not possible, you could use torch.utils.checkpoint to trade compute for memory.
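As a minimal sketch of the checkpointing idea (the toy `block` below stands in for one of UniLM's much larger transformer layers, where the savings actually matter):

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Toy stand-in for one transformer layer; in a real model you would
# wrap each expensive block this way.
block = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 256))

x = torch.randn(8, 256, requires_grad=True)

# checkpoint() does not keep the block's intermediate activations during
# the forward pass; it recomputes them in backward, trading compute for memory.
out = checkpoint(block, x, use_reentrant=False)
out.sum().backward()

print(x.grad.shape)  # gradients still flow through the checkpointed block
```

The memory savings grow with model depth, since only the block inputs (not every intermediate activation) are kept alive until backward.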

Also, check the available memory of your GPU via nvidia-smi and, if possible, stop other processes that might be using memory.
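You can also inspect memory from inside the training script; the torch.cuda memory APIs give the same numbers PyTorch prints in the error message (a small sketch; `report_gpu_memory` is just a hypothetical helper name):

```python
import torch

def report_gpu_memory() -> str:
    """Summarize device memory, mirroring the figures in the OOM error."""
    if not torch.cuda.is_available():
        return "no CUDA device available"
    free, total = torch.cuda.mem_get_info()   # bytes, for the current device
    allocated = torch.cuda.memory_allocated()  # bytes held by live tensors
    reserved = torch.cuda.memory_reserved()    # bytes held by PyTorch's caching allocator
    return (f"free {free / 2**30:.2f} GiB / total {total / 2**30:.2f} GiB, "
            f"allocated {allocated / 2**30:.2f} GiB, "
            f"reserved {reserved / 2**30:.2f} GiB")

print(report_gpu_memory())
```

Note that "reserved" counts memory held by PyTorch's caching allocator, so nvidia-smi can show the GPU as nearly full even when "allocated" is much smaller.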