Calculating loss on multiple GPUs

Hi, I want to know how to do some basic calculations, such as computing the loss, across multiple GPUs.

For example, how do I calculate the cosine similarity between different samples in a large batch or in a memory bank?
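
To make the question concrete, here is a minimal, framework-agnostic sketch of the arithmetic I have in mind, in plain Python. Everything in it is an assumption for illustration: the `shards` data is made up, each inner list stands in for one GPU's local batch, and the `all_gather` helper is a hypothetical stand-in that just concatenates lists (in a real setup this step would be a collective operation provided by the framework). Each "device" normalizes its own samples, gathers the normalized samples from all devices, and computes its own block of rows of the similarity matrix.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def all_gather(shards):
    """Hypothetical stand-in for a collective all-gather: concatenate every device's rows."""
    return [row for shard in shards for row in shard]

def local_similarity(local_rows, gathered_rows):
    """Cosine similarity of each local (unit-length) sample against every gathered sample."""
    return [[sum(a * b for a, b in zip(q, k)) for k in gathered_rows]
            for q in local_rows]

# Made-up data: 2 simulated "GPUs", each holding 2 samples of dimension 2.
shards = [
    [[1.0, 0.0], [0.0, 2.0]],    # device 0
    [[3.0, 3.0], [-1.0, 0.0]],   # device 1
]

# Step 1: each device normalizes its own samples.
normalized = [[l2_normalize(v) for v in shard] for shard in shards]

# Step 2: every device obtains all samples (or, equivalently, a shared memory bank).
bank = all_gather(normalized)

# Step 3: each device computes only its own rows of the similarity matrix.
sim_per_device = [local_similarity(shard, bank) for shard in normalized]

# Stacking the per-device row blocks gives the full 4 x 4 similarity matrix.
full_sim = all_gather(sim_per_device)
```

With this layout each device only ever computes a (local batch) x (global batch) block, so a loss defined on those rows can be evaluated per device and then averaged. Whether gradients flow back through the gather step depends on the framework, which this sketch does not model.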