Improving code readability of torchvision.models.detection

I am a student learning about the Faster R-CNN family, and I am trying to understand the implementation provided in the torchvision.models.detection module. However, I find the code difficult to read, and some blocks could be simplified. For example:

box_sum = 0
for val in boxes_per_image:
    box_sum += val
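
To illustrate the kind of simplification I mean, here is a minimal sketch. It assumes boxes_per_image is a plain list of Python integers; the values below are made up for illustration and are not taken from torchvision.

```python
# Hypothetical per-image box counts, made up for illustration.
boxes_per_image = [3, 5, 2]

# Loop style, as in the snippet above: accumulate the total by hand.
box_sum = 0
for val in boxes_per_image:
    box_sum += val

# Equivalent one-liner using Python's built-in sum().
box_sum_builtin = sum(boxes_per_image)

# Both approaches give the same total.
assert box_sum == box_sum_builtin
```

Whether this replacement is safe in the actual module depends on the element type and on any scripting or tracing constraints the maintainers have, so it would need to be checked case by case.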

Therefore, I propose to improve the code’s readability to make it more accessible to beginners like myself. Here are some possible improvements that I could make:

  • Add more comments and docstrings to explain the purpose and functionality of each function and module.
  • Simplify and reorganize complex code blocks.

I believe that these improvements will make the code more accessible to beginners and improve the overall quality of the module.

Please let me know if you have any feedback or suggestions. I would be happy to collaborate with other contributors to make these changes.

Thank you for your consideration.

Thanks for suggesting improvements to the code base!
You could create a feature request on GitHub to discuss potential changes and improvements. Let me also add @pmeier as the code owner so that he can chime in.

There is an issue now: Improving code readability of torchvision.models.detection · Issue #7458 · pytorch/vision · GitHub. Let’s keep the discussion there.
