Out of memory error with the same code on a bigger GPU

Hi,
I have the same code running on two platforms:
platform1: Quadro P4000 (8119 MB RAM)
platform2: Titan V (12033 MB RAM)
I’m using mini-batches of size 400 on both platforms.

On platform1 everything works fine (it consumes 7295/8119 MB, 99% volatile memory usage).

On platform2, however, it runs into a CUDA out of memory error and stops, while consuming only 1129/12033 MB with 0% volatile memory usage.

Traceback (most recent call last):
  File "run_meenet1.py", line 151, in <module>
    criterion=criterion)
  File "/home/ubuntu/projectSSML/meenet/modules/helpers.py", line 90, in train_batchwise
    loss.backward()
  File "/home/ubuntu/.local/lib/python3.7/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/ubuntu/.local/lib/python3.7/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA error: out of memory

The scripts are exactly the same.
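
Since the scripts are identical, the difference may lie in what PyTorch sees on each machine. Below is a minimal sketch of a check that could be run on both platforms to compare the visible devices, their total memory, and the memory allocated by the current process. The function name `cuda_device_report` is only illustrative and is not part of the original scripts. It relies on standard `torch.cuda` calls and assumes a PyTorch version recent enough to provide `torch.cuda.memory_allocated`.

```python
import torch


def cuda_device_report():
    """Return one dict per CUDA device that PyTorch can see in this process."""
    report = []
    if not torch.cuda.is_available():
        # No usable CUDA device, or a CPU-only build of PyTorch.
        return report
    for idx in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(idx)
        report.append({
            "index": idx,
            "name": props.name,
            "total_mb": props.total_memory // (1024 ** 2),
            # Memory held by tensors of this process only, not other processes.
            "allocated_mb": torch.cuda.memory_allocated(idx) // (1024 ** 2),
        })
    return report


if __name__ == "__main__":
    # The CUDA version PyTorch was built against can differ between machines.
    print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
    for entry in cuda_device_report():
        print(entry)
```

If the output differs between the two platforms (a different device index, PyTorch build, or CUDA version), that would be a reasonable place to start looking.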

What might be going wrong?
Thanks