When I look at the available models in torchvision (Models and pre-trained weights — Torchvision 0.15 documentation), the relationship between Params and GFLOPS is not linear as I would expect. For example, `VGG11_BN_Weights.IMAGENET1K_V1`

has 132.9M parameters and 7.61 GFLOPS, while `EfficientNet_V2_S_Weights.IMAGENET1K_V1`

has far fewer parameters at 21.5M but roughly the same GFLOPS at 8.37, and consequently the two take about the same time per epoch given the same inputs. Since one has about 15% as many parameters as the other, I would expect it to take about 15% of the time per epoch given the same inputs, but that is not the case. What is the relationship between the number of parameters and GFLOPS? It looks like GFLOPS is the better measure to use when comparing iteration time per epoch between models.

Strictly speaking, GFLOPS (not to be confused with GFLOPs) measures throughput: how many billions of floating point operations a device can perform *per second*, so more GFLOPS implies faster hardware, all else being equal. GFLOPs, by contrast, is a count: how many billions of floating point operations one forward pass of the model requires. Despite the column label, the torchvision table reports the latter, which is why it tracks time per epoch better than parameter count does: runtime scales with the amount of compute performed, not the number of weights stored. The two diverge because weights can be reused. A fully connected layer uses each weight exactly once per input, so its FLOPs are roughly proportional to its parameter count, while a convolution applies each weight at every output spatial position, so a small kernel can generate a large amount of compute. VGG's parameters are concentrated in its fully connected layers, whereas EfficientNet is almost entirely convolutional, which is why they end up with similar FLOPs despite a roughly 6× difference in parameter count.
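A rough back-of-the-envelope sketch of this weight-reuse effect, assuming 2 FLOPs per multiply-accumulate and ignoring bias terms (the layer sizes below are illustrative, not taken from either model):

```python
# Compare parameter count vs. FLOPs for a fully connected layer
# and a 2D convolution. Counts each multiply-accumulate as 2 FLOPs
# and ignores biases for simplicity.

def linear_stats(in_features, out_features):
    params = in_features * out_features
    flops = 2 * params  # each weight is used exactly once per input
    return params, flops

def conv2d_stats(in_ch, out_ch, k, out_h, out_w):
    params = in_ch * out_ch * k * k
    # each weight is reused at every output spatial position
    flops = 2 * params * out_h * out_w
    return params, flops

lin_p, lin_f = linear_stats(4096, 4096)          # VGG-style FC layer
conv_p, conv_f = conv2d_stats(256, 256, 3, 56, 56)  # typical 3x3 conv

print(f"linear: {lin_p / 1e6:.2f}M params, {lin_f / 1e9:.3f} GFLOPs")
print(f"conv:   {conv_p / 1e6:.2f}M params, {conv_f / 1e9:.3f} GFLOPs")
```

The fully connected layer has about 28× the parameters of the convolution, yet the convolution performs over 100× the floating point operations, because its weights are reused across all 56×56 output positions. This is the mechanism behind the VGG/EfficientNet numbers in the question.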