CUDA out of memory error - BERT

I am training a BERT model for sentiment analysis with a training set of 80k examples, but I get an out-of-memory error for batch sizes of 128, 256, and above.

Here is the stack trace:

/usr/local/lib/python3.6/dist-packages/transformers/modeling_bert.py in forward(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask, output_attentions)
337 encoder_hidden_states,
338 encoder_attention_mask,
---> 339 output_attentions,
340 )
341 attention_output = self.output(self_outputs[0], hidden_states)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
725 result = self._slow_forward(*input, **kwargs)
726 else:
---> 727 result = self.forward(*input, **kwargs)
728 for hook in itertools.chain(
729 _global_forward_hooks.values(),

/usr/local/lib/python3.6/dist-packages/transformers/modeling_bert.py in forward(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask, output_attentions)
257 # Take the dot product between "query" and "key" to get the raw attention scores.
258 attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
---> 259 attention_scores = attention_scores / math.sqrt(self.attention_head_size)
260 if attention_mask is not None:
261 # Apply the attention mask is (precomputed for all layers in BertModel forward() function)

RuntimeError: CUDA out of memory. Tried to allocate 844.00 MiB (GPU 0; 15.90 GiB total capacity; 14.36 GiB already allocated; 377.88 MiB free; 14.63 GiB reserved in total by PyTorch)

Can someone please suggest how to resolve this?
I am using a Colab GPU. Is there a limit on the size of the training data for a GPU with 15 GB of memory?

Thanks

Yes, there is always an upper limit to what fits on your GPU (and, of course, in system RAM). The size of the training set itself is usually not the problem, though, because only one batch is moved to the GPU at a time; GPU memory is consumed by the model weights, the optimizer state, and the activations for a single batch, which grow with the batch size and the sequence length.
If you are running out of memory, lower the batch size for the particular GPU you are using.
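For example, a common workaround is to keep the per-step batch small and use gradient accumulation so the effective batch size stays at 128. Below is a minimal sketch (not your exact training code): it assumes bert-base-uncased, a max sequence length of 128, and a tiny placeholder dataset in place of your real 80k examples.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import BertTokenizer, BertForSequenceClassification

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Placeholder data: swap in your real 80k-example sentiment dataset here.
texts = ["great movie", "terrible plot"] * 64
labels = torch.tensor([1, 0] * 64)
enc = tokenizer(texts, padding="max_length", truncation=True,
                max_length=128, return_tensors="pt")
train_dataset = TensorDataset(enc["input_ids"], enc["attention_mask"], labels)

batch_size = 16          # micro-batch small enough to fit in ~15 GB
accumulation_steps = 8   # 16 * 8 = effective batch size of 128
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

model.train()
optimizer.zero_grad()
for step, (input_ids, attention_mask, y) in enumerate(train_loader):
    input_ids = input_ids.to(device)
    attention_mask = attention_mask.to(device)
    y = y.to(device)
    outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=y)
    loss = outputs[0] / accumulation_steps   # scale the loss for accumulation
    loss.backward()
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```

With this approach, the micro-batch size (16 here) is what determines peak GPU memory, so you can tune it independently of the effective batch size. Reducing the maximum sequence length also helps, since attention memory grows quadratically with it.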