Why is the model split across two devices?

I used PyTorch to train a Faster R-CNN model. After converting it to a TorchScript model, I loaded it in C++ on the GPU. However, in the Qt debug table I can see that the model is split across cpu and cuda:0. Can anyone tell me why this happens and how to fix it? Thanks very much!

I have loaded the TorchScript model on cuda:0 and used print(model.forward.code) to print the forward function. I found that some variables are still on the CPU. How can I deal with this?

def forward(self,
    argument_1: Tensor) -> Tuple[Tensor, Tensor, Tensor]:
  _0 = self.model
  _1 = _0.roi_heads
  _2 = _0.rpn
  _3 = _0.backbone
  img = torch.select(argument_1, 0, 0)
  s = ops.prim.NumToTensor(torch.size(img, 1))
  s0 = ops.prim.NumToTensor(torch.size(img, 2))
  image = torch.select(argument_1, 0, 0)
  mean = torch.to(CONSTANTS.c0, torch.device("cpu"), 6, False, False, None)
  std = torch.to(CONSTANTS.c1, torch.device("cpu"), 6, False, False, None)
  _4 = torch.slice(mean, 0, 0, 9223372036854775807, 1)
  _5 = torch.unsqueeze(torch.unsqueeze(_4, 1), 2)
  _6 = torch.sub(image, _5, alpha=1)
  _7 = torch.slice(std, 0, 0, 9223372036854775807, 1)
  _8 = torch.unsqueeze(torch.unsqueeze(_7, 1), 2)
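The `torch.to(CONSTANTS.c0, torch.device("cpu"), ...)` lines look like normalization constants (image mean/std) whose device was baked in at trace time: if the model was exported with `torch.jit.trace` on a CPU example input, the concrete device becomes a graph constant, and later calling `.to(cuda)` on the module moves parameters and buffers but not those constants. Below is a minimal sketch of the mechanism with a toy `Normalize` module (my own illustration, not your actual model), plus two hedged ways to avoid it: trace with the model and example input already on the target device, or use `torch.jit.script` so the device is resolved at run time.

```python
import torch

# Toy module mimicking the normalization step visible in the dump above:
# the mean list is turned into a tensor on the *current* input's device.
class Normalize(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.image_mean = [0.485, 0.456, 0.406]

    def forward(self, x):
        mean = torch.as_tensor(self.image_mean, dtype=x.dtype, device=x.device)
        return x - mean[:, None, None]

model = Normalize().eval()

# Tracing with a CPU example input bakes torch.device("cpu") into the graph;
# moving the loaded module to CUDA later will NOT rewrite these constants,
# which is why the debugger shows tensors on both cpu and cuda:0.
traced_cpu = torch.jit.trace(model, torch.rand(3, 8, 8))

# Option 1: trace with the model and example input already on the target
# device, so the baked-in constants live on cuda:0 (requires a GPU):
#   traced_gpu = torch.jit.trace(model.to("cuda:0"),
#                                torch.rand(3, 8, 8, device="cuda:0"))

# Option 2: script instead of trace; x.device stays symbolic, so the
# exported module follows whatever device the input is on at run time.
scripted = torch.jit.script(model)
```

Option 2 is generally the more robust export path for detection models, since scripting also preserves the data-dependent control flow that tracing flattens away.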