Why does `torch.max` expect device "cpu"?

I have two tensors, pred_boxes and gt_boxes, and both are on ‘cuda’. When I pass them to the box_iou op from torchvision, I get the following error:

/opt/conda/lib/python3.7/site-packages/torchvision/ops/boxes.py in box_iou(boxes1, boxes2)
    160     area2 = box_area(boxes2)
    161 
--> 162     lt = torch.max(boxes1[:, None, :2], boxes2[:, :2])  # [N,M,2]
    163     rb = torch.min(boxes1[:, None, 2:], boxes2[:, 2:])  # [N,M,2]
    164 

RuntimeError: expected device cpu but got device cuda:0

So I called torch.max directly on the bounding box tensors to check whether it raises the same error:

torch.max(gt_boxes[:, None, :2], pred_boxes[:, :2])

This call does indeed raise the same error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-42-2742a63bf57b> in <module>
----> 1 torch.max(gt_boxes[:, None, :2], pred_boxes[:, :2])

RuntimeError: expected device cpu but got device cuda:0

Why do the tensors need to be on the CPU in order to use the max operation?

Hi Kshitij!

Please double check this. I can’t reproduce this (on an older version
of pytorch).

Could you add

print(torch.max)   # to make sure you didn't overwrite torch.max
print(gt_boxes.device, pred_boxes.device)   # verify tensor locations

before your call to torch.max()?
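
A slightly fuller version of that check is sketched below. It is only an illustration: the box values are made-up stand-ins for the real gt_boxes and pred_boxes, the `describe` helper is hypothetical, and the script falls back to the CPU when no GPU is available. It prints where each tensor lives, confirms that the two sliced views share a device, and then runs the same broadcasted torch.max call that box_iou makes.

```python
import torch


def describe(name, t):
    """Hypothetical helper: one-line summary of a tensor's location and layout."""
    return f"{name}: device={t.device}, dtype={t.dtype}, shape={tuple(t.shape)}"


# Stand-in boxes in (x1, y1, x2, y2) format; substitute the real tensors here.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
gt_boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0]], device=device)
pred_boxes = torch.tensor(
    [[5.0, 5.0, 15.0, 15.0], [20.0, 20.0, 30.0, 30.0]], device=device
)

print(torch.max)  # should still be the built-in, not something reassigned
print(describe("gt_boxes", gt_boxes))
print(describe("pred_boxes", pred_boxes))

# The same slices that box_iou builds before calling torch.max.
a = gt_boxes[:, None, :2]   # [N, 1, 2]
b = pred_boxes[:, :2]       # [M, 2]
assert a.device == b.device, f"device mismatch: {a.device} vs {b.device}"

# Element-wise maximum with broadcasting: [N, 1, 2] against [M, 2] -> [N, M, 2].
lt = torch.max(a, b)
print(describe("lt", lt))
```

If the printed devices turn out to differ, moving one tensor with something like `pred_boxes = pred_boxes.to(gt_boxes.device)` before the call would put both operands on the same device.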

Best.

K. Frank