I got the pretrained FASTERRCNN_RESNET50_FPN model from PyTorch (torchvision).
Now I want to compute the model’s complexity (number of parameters and FLOPs) as reported by torchvision.
How can I do this? For a classification model (e.g. ResNet-50) we can use tools such as thop or ptflops, but my main concern is: what is the correct input image size (width and height; channels = 3, for sure)? From my reading, Faster R-CNN accepts inputs of arbitrary size, but I have not found the step where the image is resized during the forward pass. Personally, I think the image is passed to the backbone first (which is ResNet-50), so I chose an input size of (224, 224) (the same as ImageNet’s). But when I try this with ptflops, the reported FLOPs are very unstable; a sketch of what I tried is below.
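For reference, this is roughly my attempt (a minimal sketch using ptflops’ `get_model_complexity_info` with my 224×224 guess):

```python
import torchvision
from ptflops import get_model_complexity_info

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()  # detection models behave differently in training mode

# Count MACs/params for a fixed 3x224x224 input (the ImageNet-style guess)
macs, params = get_model_complexity_info(
    model, (3, 224, 224), as_strings=True, print_per_layer_stat=False
)
print(macs, params)
```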
Any recommendations are appreciated! Thanks in advance!
The number of parameters is independent of the input size.
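You can check that directly by summing the sizes of the model’s parameter tensors, without ever building an input (a quick sketch):

```python
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

# Parameter count depends only on the architecture, not on any input resolution
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e6:.2f}M parameters")  # roughly 41.8M for this model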
For the computational complexity (e.g. MACs), torchvision sets default values for its detection models as follows:
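You can inspect those defaults on the model itself (a minimal sketch; the exact printed repr depends on your torchvision version):

```python
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

# The resizing step you were looking for lives in model.transform
# (a GeneralizedRCNNTransform), not in the backbone: images are rescaled so the
# shorter side is min_size (default 800) while the longer side stays below
# max_size (default 1333), then normalized and batched.
print(model.transform)
```

So measuring MACs with an input consistent with that resize (e.g. around 800×1333) should be far more representative than 224×224.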