Hi All,

We’ve seen many research papers comparing the **FLOPS** of different models. I’d like to understand clearly how the count is done by hand, by convention.

For example, say I’m multiplying an **MxN** matrix by an **Nx1** vector.

Each matrix row takes a dot product with the vector to produce a single scalar in the final output, which costs N multiplications and N-1 additions.

Repeating for all M matrix rows, we have MxN multiplications and Mx(N-1) additions.
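
To sanity-check the count above, here is a minimal sketch of a naive matrix-vector product that tallies scalar operations as it goes. The function name `matvec_with_counts` and the small 3x4 test matrix are just my own illustration, not taken from any paper or library.

```python
def matvec_with_counts(A, x):
    """Naive matrix-vector product that tallies scalar multiplications and additions."""
    mults = 0
    adds = 0
    y = []
    for row in A:
        # First term of the dot product: one multiplication, no addition yet.
        acc = row[0] * x[0]
        mults += 1
        # Each remaining term costs one multiplication and one addition.
        for a, b in zip(row[1:], x[1:]):
            acc += a * b
            mults += 1
            adds += 1
        y.append(acc)
    return y, mults, adds


# Hypothetical example sizes: M = 3 rows, N = 4 columns.
M, N = 3, 4
A = [[1.0] * N for _ in range(M)]
x = [2.0] * N

y, mults, adds = matvec_with_counts(A, x)
print(mults, adds)  # 12 multiplications (M*N) and 9 additions (M*(N-1))
```

For M = 3 and N = 4 the tallies come out to M*N = 12 multiplications and M*(N-1) = 9 additions, matching the hand count.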

In this example, do we count the total **FLOPS** as Mx(2N-1) (counting additions and multiplications separately), or as MxN (counting each multiply-add as one operation)?
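
To show how far apart the two candidate counts are, here is a small sketch that evaluates both formulas side by side. The helper name `matvec_op_counts` and the 1000x1000 size are hypothetical, and the code does not settle which convention a given paper uses.

```python
def matvec_op_counts(M, N):
    """Return both candidate totals for an MxN matrix times an Nx1 vector."""
    # Counting every multiplication and every addition separately.
    separate = M * (2 * N - 1)
    # Counting each multiply-add pair as a single operation.
    multiply_adds = M * N
    return separate, multiply_adds


# Hypothetical example size.
separate, multiply_adds = matvec_op_counts(1000, 1000)
print(separate, multiply_adds)   # 1999000 1000000
print(separate / multiply_adds)  # 1.999
```

For large N the two totals differ by a factor of almost exactly 2, so which convention is used matters when comparing numbers reported by different papers.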

Thank you very much!