What are the best practices for logging in distributed training?

Did some googling and found very few discussions on this topic. Is it best to all-reduce metrics such as the loss value across ranks and then log the result only from the rank-0 process, similar to what the official tutorial recommends for saving checkpoints?
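For reference, here is a minimal sketch of that pattern with `torch.distributed`. The helper name `log_loss` is my own invention, and the demo initializes a single-process `gloo` group just so it runs standalone; in real DDP training, `torchrun` sets the rank/world size and every rank joins the same group. Note that `all_reduce` is a collective, so every rank must call it, but only rank 0 writes to the log:

```python
import os
import torch
import torch.distributed as dist

def log_loss(loss: torch.Tensor) -> float:
    """All-reduce the local loss and return the global mean.

    Every rank must call this (all_reduce is a collective op),
    but only rank 0 should actually write to its logger.
    """
    loss = loss.detach().clone()
    dist.all_reduce(loss, op=dist.ReduceOp.SUM)
    mean = (loss / dist.get_world_size()).item()
    if dist.get_rank() == 0:
        print(f"step loss: {mean:.4f}")  # swap in your logger of choice
    return mean

# Single-process demo with the gloo backend so the sketch is runnable
# without a GPU; under torchrun these env vars come from the launcher.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)
global_mean = log_loss(torch.tensor(2.0))
dist.destroy_process_group()
```

An alternative, if you only need approximate curves, is to skip the all-reduce entirely and log each rank's local loss to separate files, since the collective adds a synchronization point on every logged step.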