I recently came across the torchvision.transforms.v2.functional.rotate_bounding_boxes function (1#L1419). I did not find any documentation for it on the web, but the function does not seem to be internal either (at least judging from the naming scheme). However, I ran into a peculiar issue.
import torch
from torchvision.transforms.v2.functional import rotate_bounding_boxes
from torchvision.tv_tensors import BoundingBoxes
data = torch.tensor([[0, 0, 10, 10], [30, 40, 50, 60]])
bboxes1 = BoundingBoxes(data, format="xyxy", canvas_size=(1000, 500))
bboxes2 = BoundingBoxes(data.float(), format="xyxy", canvas_size=(1000, 500))
rotated_bboxes1, tup = rotate_bounding_boxes(bboxes1, bboxes1.format, (1000, 500), angle=90)
print("successfully rotated bboxes1!")
rotated_bboxes2 = rotate_bounding_boxes(bboxes2, bboxes2.format, (500, 1000), angle=90)
print("successfully rotated bboxes2!")
Here is the output:
$ python foo.py
successfully rotated bboxes1!
Traceback (most recent call last):
File "/home/sayandip/github/vision/foo.py", line 13, in <module>
rotated_bboxes2 = rotate_bounding_boxes(bboxes2, bboxes2.format, (500, 1000), angle=90)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sayandip/github/vision/torchvision/transforms/v2/functional/_geometry.py", line 1428, in rotate_bounding_boxes
return _affine_bounding_boxes_with_expand(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sayandip/github/vision/torchvision/transforms/v2/functional/_geometry.py", line 1102, in _affine_bounding_boxes_with_expand
convert_bounding_box_format(bounding_boxes, old_format=format, new_format=intermediate_format, inplace=True)
File "/home/sayandip/github/vision/torchvision/transforms/v2/functional/_meta.py", line 336, in convert_bounding_box_format
raise ValueError("For bounding box tv_tensor inputs, `old_format` must not be passed.")
ValueError: For bounding box tv_tensor inputs, `old_format` must not be passed.
So when the bounding box data is floating point, the call fails with ValueError: For bounding box tv_tensor inputs, `old_format` must not be passed.
I went digging and found two related issues. The error does not show up for non-floating-point tensors because of these lines (1#L1095-1096):
need_cast = not bounding_boxes.is_floating_point()
bounding_boxes = bounding_boxes.float() if need_cast else bounding_boxes.clone()
<tensor_subclass>.float() unwraps the BoundingBoxes instance into a pure tensor, so this later check (2#L330):
if torch.jit.is_scripting() or is_pure_tensor(inpt):
evaluates to True and the function exits successfully. For floating-point inputs, the BoundingBoxes instance takes the .clone() path and is not unwrapped, so the check above evaluates to False and execution reaches this branch (2#L334-L336):
elif isinstance(inpt, tv_tensors.BoundingBoxes):
if old_format is not None:
raise ValueError("For bounding box tv_tensor inputs, `old_format` must not be passed.")
which raises the ValueError.
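To make the control flow easier to follow, here is a dependency-free toy model of the two code paths described above. It is only a sketch: PureTensor, Boxes, convert_format, affine_with_expand and outcome are hypothetical stand-ins that I made up for illustration. They are not torchvision code and only mimic the branching quoted from _geometry.py and _meta.py, under the assumption that .float() drops the subclass while .clone() keeps it.

```python
# Toy stand-ins (hypothetical, NOT torchvision code): they only mimic the
# control flow described in the post.


class PureTensor:
    """Stand-in for a plain torch.Tensor."""

    def __init__(self, values, is_float):
        self.values = list(values)
        self.is_float = is_float

    def is_floating_point(self):
        return self.is_float

    def float(self):
        # Assumption from the post: .float() returns a pure tensor,
        # even when called on the subclass.
        return PureTensor([float(v) for v in self.values], is_float=True)

    def clone(self):
        # Assumption from the post: .clone() preserves the subclass.
        return type(self)(self.values, self.is_float)


class Boxes(PureTensor):
    """Stand-in for tv_tensors.BoundingBoxes."""


def convert_format(inpt, old_format=None):
    # Same branch order as the quoted lines: pure tensors first,
    # then the bounding-box subclass.
    if type(inpt) is PureTensor:
        return inpt
    elif isinstance(inpt, Boxes):
        if old_format is not None:
            raise ValueError(
                "For bounding box tv_tensor inputs, `old_format` must not be passed."
            )
        return inpt


def affine_with_expand(boxes, format):
    # Mirrors the need_cast logic quoted above.
    need_cast = not boxes.is_floating_point()
    boxes = boxes.float() if need_cast else boxes.clone()
    return convert_format(boxes, old_format=format)


def outcome(boxes):
    """Return "ok" if the call succeeds, "ValueError" if it raises."""
    try:
        affine_with_expand(boxes, "xyxy")
        return "ok"
    except ValueError:
        return "ValueError"


print("int boxes:  ", outcome(Boxes([0, 0, 10, 10], is_float=False)))
print("float boxes:", outcome(Boxes([0.0, 0.0, 10.0, 10.0], is_float=True)))
```

In this model the integer boxes pass only because the cast happens to strip the subclass before the format check, which matches what I see with the real function.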
Now this can be “fixed”, and I am willing to do the work, but I would like to confirm with the experts here whether this is in fact a bug and, more importantly, whether this function is part of the public API that people should depend on (it seems to be, but since it is undocumented, I am not totally convinced).
If the answer to both is yes, what is the next step? Shall I open an issue and a PR? I have read the contributing guide, but I wanted to confirm since I am new here. Thanks a lot!
System details
$ python --version
Python 3.12.3
$ uname -r
6.6.87.2-microsoft-standard-WSL2
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 24.04.3 LTS
Release: 24.04
Codename: noble
$ pip show torch torchvision
Name: torch
Version: 2.10.0.dev20251101+cpu
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page:
Author:
Author-email: PyTorch Team <packages@pytorch.org>
License: BSD-3-Clause
Location: /home/sayandip/github/vision/.venv/lib/python3.12/site-packages
Requires: filelock, fsspec, jinja2, networkx, setuptools, sympy, typing-extensions
Required-by: torchvision
---
Name: torchvision
Version: 0.25.0a0+cfbc5c2
Summary: image and video datasets and models for torch deep learning
Home-page: https://github.com/pytorch/vision
Author: PyTorch Core Team
Author-email: soumith@pytorch.org
License: BSD
Location: /home/sayandip/github/vision/.venv/lib/python3.12/site-packages
Editable project location: /home/sayandip/github/vision
Requires: numpy, pillow, torch
Required-by:
PS: Since I am a new user and limited to two links, I couldn’t link to every line in question with a permalink.