Hi,
I haven’t looked at pytorch’s examples, so I can’t comment on those.
FLOPs are floating-point operations, but that term is a bit ambiguous, as described here:
When dealing with computing effort and computing speed (hardware performance), the terminology
is often confusing. For instance, the term ‘compute’ is used ambiguously, sometimes referring to
the number of operations and sometimes to the number of operations per second. It is therefore important to clarify
what kinds of operations are meant and which acronyms denote them. In this regard, we will use the acronym FLOPS
to measure hardware performance, referring to the number of floating-point operations per second, as standardised in the industry, while FLOPs will denote the amount of computation for
a given task (e.g., a prediction or inference pass), referring to the number of operations, counting
a multiply-add operation pair as two operations. An extended discussion about this can be found in
the appendix.
In deep learning these floating-point operations are mostly multiply-add pairs. As you can see, the terminology needs some clarification, but the general idea is to report computational cost in a hardware-agnostic way.
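As a rough illustration (my own sketch, not taken from any library), you can count FLOPs for a matrix multiply or a dense layer by hand using the convention above, i.e. one multiply-add pair counts as two operations:

```python
def matmul_flops(m: int, k: int, n: int) -> int:
    """FLOPs for an (m x k) @ (k x n) matrix multiply.

    Each of the m*n output elements needs k multiply-add pairs,
    and each pair counts as 2 operations under the convention above.
    """
    macs = m * k * n   # multiply-accumulate operations
    return 2 * macs    # one MAC = 1 multiply + 1 add

def linear_flops(batch: int, in_features: int, out_features: int) -> int:
    """FLOPs for a fully connected layer (ignoring the bias add)."""
    return matmul_flops(batch, in_features, out_features)

print(matmul_flops(4, 5, 6))         # 240
print(linear_flops(32, 1024, 1024))  # 67108864
```

Tools such as PyTorch profilers can do this counting for you on a whole model, but the convention (MACs vs. FLOPs, factor of two) is exactly what varies between papers, so it is worth stating explicitly.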
For example, some papers report inference time, but this number obviously depends on the hardware and the batch size (chosen so that the GPU runs at its maximum/optimal workload). Similarly, you can report how many images per unit of time you can process (in computer vision). Yet again, this depends on image size, batch size, hardware…
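If you do measure inference time, a minimal sketch looks like this. It is pure Python; `model` and `batch` are hypothetical stand-ins for your network and input, and on a GPU you would additionally need to synchronise the device before reading the clock:

```python
import time

def benchmark(model, batch, batch_size, warmup=10, iters=100):
    """Return a rough throughput (items/second) for a callable `model`.

    Warmup runs are discarded so one-time costs (allocation,
    compilation, cache warming) don't pollute the measurement.
    """
    for _ in range(warmup):
        model(batch)
    start = time.perf_counter()
    for _ in range(iters):
        model(batch)
    elapsed = time.perf_counter() - start
    return iters * batch_size / elapsed

# Toy stand-in for a model: just sums a list of numbers.
throughput = benchmark(sum, list(range(1000)), batch_size=32)
print(f"{throughput:.0f} items/s")
```

The resulting number is only meaningful alongside the hardware, batch size, and input size it was measured with, which is exactly why it is a poor cross-paper metric.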
So in the end the most agnostic way (although the hardest) of reporting this information is the required number of operations. If my algorithm can sum 2 numbers in 5 steps and yours needs 7, mine is better.
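To make the “fewer operations wins” point concrete, here is a hypothetical example of my own: evaluating a degree-n polynomial naively versus with Horner’s rule, counting multiplies and adds. The comparison holds on any hardware, which is the whole appeal of operation counts:

```python
def naive_ops(n: int) -> int:
    """Operations to evaluate c_0 + c_1*x + ... + c_n*x^n naively.

    Term i costs i multiplies for x**i plus one multiply by c_i;
    then n additions join the n+1 terms.
    """
    mults = sum(i + 1 for i in range(1, n + 1))
    adds = n
    return mults + adds

def horner_ops(n: int) -> int:
    """Horner's rule needs exactly n multiplies and n adds."""
    return 2 * n

for n in (4, 16):
    print(n, naive_ops(n), horner_ops(n))  # Horner is always cheaper
```

Comparing real networks works the same way in principle, just with FLOP counts per forward pass instead of hand-counted steps.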
In your case, the simplest option would be to run your own tests. Also note that compiled models can benefit from extra optimisations (there is torch.compile now). For example, transformers can now use flash attention in pytorch, and so on…
My advice is:
Just take the metric that best fits your problem (inference time, images/second) and read a bit about how to make a fair comparison.