What is the theoretical foundation of nn.CosineEmbeddingLoss?

I came across the loss function nn.CosineEmbeddingLoss, and its idea seems quite similar to the contrastive loss used in Siamese networks. Is this loss a better and more stable version of contrastive loss, or is there a paper that proposes it?
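
To make the comparison concrete, here is a minimal pure-Python sketch of the two losses as I understand them. The cosine version follows the per-pair formula in the nn.CosineEmbeddingLoss documentation. The contrastive version is one common squared-distance form with a 1/2 factor; conventions vary between papers, so treat it as an assumption. The function names are my own, and zero vectors are not handled.

```python
import math

def cosine_similarity(a, b):
    # Plain cosine similarity; assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_embedding_loss(a, b, y, margin=0.0):
    # y = 1 for a similar pair, y = -1 for a dissimilar pair.
    cos = cosine_similarity(a, b)
    if y == 1:
        return 1.0 - cos
    return max(0.0, cos - margin)

def contrastive_loss(a, b, y, margin=1.0):
    # y = 1 for a similar pair, y = 0 for a dissimilar pair
    # (assumed label convention; some papers flip it).
    d = math.sqrt(sum((x - z) ** 2 for x, z in zip(a, b)))
    if y == 1:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2
```

Both penalize similar pairs for being far apart and dissimilar pairs only while they are within a margin. The visible difference is that one measures the angle between embeddings and the other measures Euclidean distance.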