I am a student learning about the Faster R-CNN family, and I am trying to understand the implementation provided in the torchvision.models.detection
module. However, I find the code difficult to read and understand, and some blocks could be simplified. For example:
# https://github.com/pytorch/vision/blob/2b25d67925df9741ba2a75a07bc3046302969e87/torchvision/models/detection/_utils.py#L162
box_sum = 0
for val in boxes_per_image:
    box_sum += val
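As a minimal sketch of the kind of simplification I have in mind, the loop above could be replaced with Python's built-in `sum()`. The values in `boxes_per_image` below are made-up placeholders, and I am assuming the explicit loop is not needed for some other reason (for example, export or scripting constraints), which the maintainers would have to confirm.

```python
# Placeholder data: number of boxes in each image (hypothetical values).
boxes_per_image = [3, 5, 2]

# Current form: accumulate the total with an explicit loop.
box_sum_loop = 0
for val in boxes_per_image:
    box_sum_loop += val

# Possible simplification: the built-in sum() does the same accumulation.
box_sum = sum(boxes_per_image)

# Both forms should produce the same total.
print(box_sum == box_sum_loop)
```

If the loop does turn out to be required, a short comment explaining why would already make the block easier to follow.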
Therefore, I propose to improve the code’s readability to make it more accessible to beginners like myself. Here are some possible improvements that I could make:
- Add more comments and docstrings to explain the purpose and functionality of each function and module.
- Simplify and reorganize complex code blocks.
I believe these changes would also improve the overall quality of the module.
Please let me know if you have any feedback or suggestions. I would be happy to collaborate with other contributors to make these changes.
Thank you for your consideration.