What is the best loss function when the targets are vectors?

For a model that takes a word embedding vector as input and is supposed to predict another vector, what is the best loss function? Assume the training data contains a target embedding vector for each input.
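
To make the setup concrete, here is a minimal NumPy sketch of two losses that are often considered for vector-valued targets: mean squared error and cosine distance. This is only an illustration of the candidates, not a claim about which one is best. The function names `mse_loss` and `cosine_loss`, the `eps` parameter, and the toy numbers are all assumptions introduced for this example.

```python
import numpy as np

def mse_loss(pred, target):
    """Mean squared error, averaged over batch and embedding dimensions."""
    return float(np.mean((pred - target) ** 2))

def cosine_loss(pred, target, eps=1e-8):
    """One minus cosine similarity per row, averaged over the batch."""
    dot = np.sum(pred * target, axis=1)
    norms = np.linalg.norm(pred, axis=1) * np.linalg.norm(target, axis=1)
    # eps (an assumed small constant) guards against division by zero.
    return float(np.mean(1.0 - dot / (norms + eps)))

# Toy batch: 2 examples with 3-dimensional embeddings (made-up numbers).
target = np.array([[1.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0]])
# Predictions point in the right direction but have twice the length.
pred = 2.0 * target

print("MSE:", mse_loss(pred, target))
print("Cosine loss:", cosine_loss(pred, target))
```

In this toy batch the predictions have the correct direction but the wrong length, so the squared error is clearly nonzero while the cosine loss is essentially zero. Which behaviour is preferable presumably depends on whether the magnitude of the target embedding carries information in the task at hand.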