What is the mainstream approach to sentence embedding?

Is it to use the [CLS] token, or just to take the mean of the output embeddings?