How are tokens per second calculated for LLM training?

How are tokens per second calculated for LLM training? Is it computed the same way as for LLM inference?
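For reference, here is a minimal sketch of what I'd assume the training-side measurement looks like: total tokens processed (batch size × sequence length × steps) divided by wall-clock time. All names here (`train_throughput`, `step_fn`) are hypothetical, and I'm unsure whether padding tokens are usually counted.

```python
import time

def train_throughput(num_steps: int, batch_size: int, seq_len: int, step_fn) -> float:
    """Estimate training throughput in tokens per wall-clock second.

    Assumes each step processes batch_size * seq_len tokens
    (a common convention; some setups count only non-padding tokens).
    """
    start = time.perf_counter()
    for _ in range(num_steps):
        step_fn()  # one forward + backward + optimizer step
    elapsed = time.perf_counter() - start
    total_tokens = num_steps * batch_size * seq_len
    return total_tokens / elapsed
```

Is this roughly right, and does inference instead measure only generated (decoded) tokens per second rather than all tokens in the batch?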