RuntimeError: std::bad_alloc: temporary_buffer::allocate: get_temporary_buffer failed in RPN

Here is the full traceback:

Traceback (most recent call last):
  File "/home/shiqian/zhangzh/FAFRCNN/", line 28, in <module>
  File "/home/shiqian/zhangzh/FAFRCNN/", line 24, in train
  File "/home/shiqian/zhangzh/FAFRCNN/", line 82, in transfer_train_fast
    adv_meters, meters, ins_adv_optimizer, img_adv_optimizer, train_num=cfg.dis_train_num)
  File "/home/shiqian/zhangzh/FAFRCNN/", line 61, in dis_train
    faster_rcnn_mp.rpn_features(t_images, t_features, t_targets)
  File "/home/shiqian/zhangzh/FAFRCNN/model/faster_rcnn/", line 269, in rpn_features
    proposals, proposal_losses, iou, anchor_len = self.rpn(images, features, targets)
  File "/home/shiqian/.local/lib/python3.5/site-packages/torch/nn/modules/", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/shiqian/zhangzh/FAFRCNN/model/faster_rcnn/", line 344, in forward
    ,, labels, regression_targets)
  File "/home/shiqian/.local/lib/python3.5/site-packages/torchvision/models/detection/", line 371, in compute_loss
  File "/home/shiqian/.local/lib/python3.5/site-packages/torch/nn/", line 2179, in l1_loss
    ret = torch._C._nn.l1_loss(expanded_input, expanded_target, _Reduction.get_enum(reduction))
RuntimeError: std::bad_alloc: temporary_buffer::allocate: get_temporary_buffer failed

I have encountered this error many times, and it always happens at the RPN's loss computation. Sometimes I can train my model for many epochs without hitting it, but sometimes it occurs as early as the second epoch. I am training on a server that should have enough memory. This is the output of free -h in bash:

              total        used        free      shared  buff/cache   available
Mem:           376G        103G         79G         10G        193G        259G
Swap:          7.6G        5.4G        2.2G
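To check whether host memory is actually exhausted at the moment of the crash (rather than relying on a snapshot of free -h taken at some other time), one option is to log the process's peak resident set size every few iterations. This is a minimal, stdlib-only sketch; log_memory and the commented call site are my own names, not part of the training code:

```python
import resource


def log_memory(tag: str) -> int:
    """Print and return this process's peak resident set size in MiB.

    On Linux, ru_maxrss is reported in KiB (on macOS it is bytes,
    so the divisor would need adjusting there).
    """
    peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    peak_mib = peak_kib // 1024
    print(f"[{tag}] peak RSS: {peak_mib} MiB")
    return peak_mib


# Hypothetical call site inside the training loop:
# log_memory(f"epoch {epoch} iter {it} before rpn loss")
```

If the logged value climbs steadily across epochs, that would point to a leak rather than a one-off oversized allocation.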

I manually modified torchvision's FasterRCNN to enable model parallelism; the other components are basically unchanged.
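Since the failure is always raised inside the RPN loss, one way to localize it is to wrap the loss call and dump the input shapes when it fails, since the size of the temporary allocation should track the number of sampled anchors and targets. This is a framework-agnostic sketch (safe_loss is a name I made up; it only assumes the inputs expose a .shape attribute, as torch tensors do):

```python
def safe_loss(loss_fn, *tensors, names=None):
    """Call loss_fn(*tensors); if it raises, report each input's shape
    so an abnormally large batch of anchors/targets shows up in the log."""
    try:
        return loss_fn(*tensors)
    except RuntimeError:
        names = names or [f"arg{i}" for i in range(len(tensors))]
        for name, t in zip(names, tensors):
            print(f"{name}: shape={getattr(t, 'shape', '<no shape>')}")
        raise


# Hypothetical use at the failing site:
# loss = safe_loss(F.l1_loss, expanded_input, expanded_target,
#                  names=["input", "target"])
```

If the shapes printed right before the crash are much larger than on healthy iterations, the problem is in the inputs to the loss rather than in the loss itself.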