GPU out-of-memory error occurs while moving a tensor to the CPU

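While computing the IoU between proposals and targets in the detectron2 framework, the matcher call is wrapped in retry_if_cuda_oom, so when a CUDA OOM occurs it tries to move the inputs to the CPU and run the calculation there. However, a GPU out-of-memory error is still raised during the move to the CPU itself (the x.to(device="cpu") call in maybe_to_cpu).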

Traceback (most recent call last):
  File "main.py", line 76, in <module>
    loss_dict = model(data)
  File "/root/anaconda3/envs/python367/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/detectron2/detectron2/modeling/meta_arch/rcnn.py", line 124, in forward
    proposals, proposal_losses = self.proposal_generator(images, features, gt_instances)
  File "/root/anaconda3/envs/python367/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/yaogan/custom/kaggle_rpn.py", line 114, in forward
    losses = outputs.losses()
  File "/root/detectron2/detectron2/modeling/proposal_generator/rpn_outputs.py", line 333, in losses
    gt_objectness_logits, gt_anchor_deltas = self._get_ground_truth()
  File "/root/yaogan/custom/kaggle_rpn.py", line 57, in _get_ground_truth
    matched_idxs, gt_objectness_logits_i = retry_if_cuda_oom(self.anchor_matcher)(match_quality_matrix)
  File "/root/detectron2/detectron2/utils/memory.py", line 84, in wrapped
    return func(*new_args, **new_kwargs)
  File "/root/detectron2/detectron2/utils/memory.py", line 82, in <genexpr>
    new_args = (maybe_to_cpu(x) for x in args)
  File "/root/detectron2/detectron2/utils/memory.py", line 65, in maybe_to_cpu
    return x.to(device="cpu")
RuntimeError: CUDA out of memory. Tried to allocate 2.27 GiB (GPU 0; 10.76 GiB total capacity; 7.76 GiB already allocated; 334.69 MiB free; 9.60 GiB reserved in total by PyTorch)
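
For context, here is a minimal sketch of the fallback pattern as I understand it from the frames in memory.py above. It is a simplification and not the actual detectron2 source: the names retry_on_cpu_if_cuda_oom and _is_cuda_oom are hypothetical, the tensor check is duck-typed so the sketch runs without PyTorch, and I believe the real retry_if_cuda_oom also clears the CUDA cache and retries on the GPU before falling back. What it shows is that the move to the CPU happens outside any OOM handling, so an error raised by x.to(device="cpu") propagates to the caller, which matches the traceback.

```python
from functools import wraps


def _is_cuda_oom(err):
    # Hypothetical helper: PyTorch reports CUDA OOM as a RuntimeError
    # whose message contains this text (see the traceback above).
    return "CUDA out of memory" in str(err)


def maybe_to_cpu(x):
    # Move anything that looks like a CUDA tensor to the CPU;
    # every other argument is passed through unchanged.
    device = getattr(x, "device", None)
    if getattr(device, "type", None) == "cuda" and hasattr(x, "to"):
        # Not guarded: if this call itself raises an OOM, it propagates.
        return x.to(device="cpu")
    return x


def retry_on_cpu_if_cuda_oom(func):
    # Simplified stand-in for detectron2's retry_if_cuda_oom (assumption).
    @wraps(func)
    def wrapped(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except RuntimeError as err:
            if not _is_cuda_oom(err):
                raise
        # Fallback: copy the inputs to the CPU and call func again.
        new_args = [maybe_to_cpu(a) for a in args]
        new_kwargs = {k: maybe_to_cpu(v) for k, v in kwargs.items()}
        return func(*new_args, **new_kwargs)

    return wrapped
```

In my case the wrapped function is self.anchor_matcher and the argument is match_quality_matrix, so the failing call corresponds to the unguarded x.to(device="cpu") line in this sketch.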

Why does moving the tensor to the CPU still run out of GPU memory?