Differential privacy library Opacus in PyTorch: CUBLAS_STATUS_ALLOC_FAILED error

Hello, I have been trying to attach Opacus to my model's optimiser. I am running it on Colab. During training I run into this error:
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)
Could someone please guide me on this?
Thanks!

This error can be raised when the GPU is running out of memory, so cuBLAS isn’t able to allocate its handle. Opacus computes per-sample gradients, which needs more memory than regular training with the same batch size.
Could you check the memory usage via nvidia-smi and see whether you are close to the device limit?
If that’s the case, try lowering the batch size and rerun the code.
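
In case it helps, here is a minimal sketch for checking the memory usage from Python instead of reading the nvidia-smi table by eye. It only relies on nvidia-smi's CSV query flags; the helper names (parse_smi_csv, gpu_memory_mib) and the 0.9 threshold are made up for illustration and are not part of Opacus or PyTorch.

```python
import shutil
import subprocess


def parse_smi_csv(text):
    """Parse 'used, total' lines (values in MiB) from nvidia-smi CSV output."""
    rows = []
    for line in text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        try:
            used, total = int(parts[0]), int(parts[1])
        except (ValueError, IndexError):
            continue  # skip lines that are not two plain integers, e.g. "[N/A]"
        rows.append((used, total))
    return rows


def gpu_memory_mib():
    """Return [(used_mib, total_mib), ...] per GPU, or [] if nvidia-smi is missing."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    return parse_smi_csv(result.stdout)


if __name__ == "__main__":
    for index, (used, total) in enumerate(gpu_memory_mib()):
        # 0.9 is an arbitrary illustrative threshold, not an official limit.
        flag = "close to the limit" if total and used / total > 0.9 else "ok"
        print(f"GPU {index}: {used} / {total} MiB ({flag})")
```

Running this right before the training loop and again after the failure should show whether the device is nearly full. If it is, halving the batch size passed to your DataLoader and retrying is a reasonable first step.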