CUDA Out of Memory: Implementing a Paper

I have an M1 Mac, so I cannot use CUDA locally; instead I am running the code on a Kaggle GPU. I am trying to implement this paper: https://github.com/cuhksz-nlp/R2GenCMN

I keep getting the following error:

Traceback (most recent call last):
  File "main.py", line 135, in <module>
    main()
  File "main.py", line 131, in main
    trainer.train()
  File "/kaggle/working/1R2GenCMN/modules/trainer.py", line 58, in train
    result = self._train_epoch(epoch)
  File "/kaggle/working/1R2GenCMN/modules/trainer.py", line 184, in _train_epoch
    loss.backward()
  File "/opt/conda/lib/python3.7/site-packages/torch/_tensor.py", line 363, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/opt/conda/lib/python3.7/site-packages/torch/autograd/__init__.py", line 175, in backward
    allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA out of memory. Tried to allocate 6.12 GiB (GPU 0; 14.76 GiB total capacity; 4.51 GiB already allocated; 5.53 GiB free; 8.17 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
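
The error message itself suggests setting max_split_size_mb via PYTORCH_CUDA_ALLOC_CONF. Here is a minimal sketch of what I understand that to mean (the 128 MiB value is my own guess, not something from the repo or the docs):

    import os

    # Must be set before the first CUDA allocation, i.e. before torch is used.
    # Caps the allocator's block-split size to reduce fragmentation; 128 is a guess.
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

    import torch  # imported after setting the variable so it takes effect

Even so, the traceback shows a single 6.12 GiB allocation request, which makes me think the batch is simply too big for Kaggle's roughly 15 GiB GPU, and that fragmentation is not the root cause.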

How can I solve this on Kaggle?
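
Would reducing the batch size, or using gradient accumulation, be the right approach? This is the pattern I have in mind; the model, data, and loss below are dummy stand-ins for illustration, not the repo's actual code:

    import torch
    import torch.nn as nn

    # Dummy stand-ins for the repo's model and data loader; only the
    # accumulation pattern matters here.
    model = nn.Linear(16, 4)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    criterion = nn.MSELoss()
    batches = [(torch.randn(2, 16), torch.randn(2, 4)) for _ in range(8)]

    accum_steps = 4  # effective batch = micro-batch size * accum_steps

    optimizer.zero_grad()
    for i, (x, y) in enumerate(batches):
        loss = criterion(model(x), y) / accum_steps  # scale so grads average out
        loss.backward()  # gradients accumulate across micro-batches
        if (i + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()

The idea would be to keep per-step GPU memory low while preserving the effective batch size, if that is appropriate for this model.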