Correct way to calculate FLOPS in model

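FLOP count is a property of an algorithm rather than of a model. Does a Linear layer have 2mqp or mq(2p-1) FLOPs? It depends on how the matmul is performed: each of the mq output elements needs p multiplications and p-1 additions, which gives mq(2p-1), while counting every multiply-add pair as 2 operations gives 2mqp (see discussion here). You can get an approximate count by assuming some reference implementation.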
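To make the two conventions concrete, here is a minimal sketch. It assumes the Linear layer is an (m x p) input times a (p x q) weight with the bias ignored; the function name `linear_flops` and the `exact_adds` flag are made up for illustration and are not part of any library.

```python
def linear_flops(m: int, p: int, q: int, exact_adds: bool = False) -> int:
    """Approximate FLOPs for an (m x p) @ (p x q) product, bias ignored.

    Hypothetical helper: the shapes and the flag are assumptions for this sketch.
    """
    if exact_adds:
        # p multiplications and p - 1 additions per output element
        return m * q * (2 * p - 1)
    # p multiply-add pairs per output element, each pair counted as 2 FLOPs
    return 2 * m * q * p


if __name__ == "__main__":
    print(linear_flops(32, 512, 1024))                   # 2mqp convention
    print(linear_flops(32, 512, 1024, exact_adds=True))  # mq(2p-1) convention
```

The two results differ by exactly mq, which is negligible next to 2mqp once p is large, so either convention works for a rough estimate.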

nn.Embedding is a dictionary lookup, so technically it has 0 FLOPs.
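A small pure-Python sketch of why the lookup counts as zero: indexing rows gives the same result as multiplying a one-hot matrix by the table, but only the latter performs arithmetic. The table values and function names here are invented for illustration, and this is not how nn.Embedding is implemented internally.

```python
# Hypothetical 4-token vocabulary with 3-dimensional embeddings.
table = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [1.0, 1.1, 1.2],
]


def embed_lookup(ids):
    """Row indexing only: no multiplications or additions are performed."""
    return [table[i] for i in ids]


def embed_one_hot(ids):
    """Same result via a one-hot matrix product, which does cost arithmetic."""
    vocab, dim = len(table), len(table[0])
    out = []
    for i in ids:
        one_hot = [1.0 if j == i else 0.0 for j in range(vocab)]
        row = [sum(one_hot[j] * table[j][k] for j in range(vocab)) for k in range(dim)]
        out.append(row)
    return out


if __name__ == "__main__":
    ids = [2, 0]
    print(embed_lookup(ids) == embed_one_hot(ids))
```

If an implementation did use the one-hot product, the layer would cost roughly as much as a Linear layer with the vocabulary size as its input dimension, which is another reason the count depends on the algorithm.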

Since the FLOP count is going to be approximate anyway, you only need to care about the layers that are heaviest to compute. You could profile your model and see whether there are any expensive layers not covered already. TensorFlow has some reference formulas here.
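As a rough illustration of focusing on the heaviest layers, the sketch below ranks layers by an estimated FLOP count. It is a static estimate, not a profiler: the layer list, the shapes, and the helper names are all hypothetical, Linear layers use the 2mqp convention, and anything else is counted as 0 for simplicity.

```python
# Hypothetical layer list: (name, kind, shape). All shapes are made-up examples.
layers = [
    ("embed", "embedding", (50000, 512)),  # (vocab, dim)
    ("fc1", "linear", (512, 2048)),        # (in_features, out_features)
    ("fc2", "linear", (2048, 512)),
    ("head", "linear", (512, 50000)),
]


def estimate_flops(kind, shape, batch):
    """Approximate FLOPs for one forward pass of a single layer."""
    if kind == "linear":
        p, q = shape
        return 2 * batch * p * q  # 2mqp convention with m = batch
    return 0  # lookups (and any kind this sketch does not model) count as 0


def rank_layers(layers, batch=1):
    """Return (name, flops, share_of_total) sorted from heaviest to lightest."""
    counts = [(name, estimate_flops(kind, shape, batch)) for name, kind, shape in layers]
    total = sum(c for _, c in counts) or 1  # avoid division by zero
    ranked = sorted(counts, key=lambda nc: nc[1], reverse=True)
    return [(name, c, c / total) for name, c in ranked]


if __name__ == "__main__":
    for name, count, share in rank_layers(layers, batch=32):
        print(f"{name:6s} {count:>15,d} {share:6.1%}")
```

In this made-up example the output projection dominates the total, so refining the formulas for the smaller layers would barely change the estimate.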