BERT Embedding Vector

In applications like BERT, does the embedding capture the semantic meaning of the word, or does it essentially learn a pseudo-orthogonal representation that is friendly to the transformer it feeds?
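
To make "capture semantic meaning" concrete, here is a minimal sketch of the kind of probe I have in mind. It assumes the Hugging Face `transformers` package and the `bert-base-uncased` checkpoint, and reads the static input (wordpiece) embedding table directly, before any transformer layers run:

```python
# Probe BERT's raw input embedding table for semantic similarity.
# Assumes Hugging Face `transformers` and the bert-base-uncased checkpoint.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
emb = model.embeddings.word_embeddings.weight  # (vocab_size, embedding_dim)

def vec(token):
    # Look up the static embedding row for a single wordpiece token.
    idx = tokenizer.convert_tokens_to_ids(token)
    return emb[idx]

cos = torch.nn.functional.cosine_similarity
print(cos(vec("king"), vec("queen"), dim=0))   # related pair
print(cos(vec("king"), vec("banana"), dim=0))  # unrelated pair
```

If the embedding is semantic, the related pair should score noticeably higher; if it is merely a pseudo-orthogonal code for the transformer, the two scores should look similar.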

To ask essentially the same question another way: in BERT-like applications, is the embedding equivalent to an orthogonal (e.g., one-hot) vector projected down into a vector of dimension embedding_dim, where the projection is learned?
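
For reference, here is a minimal PyTorch sketch of the equivalence I am asking about (the sizes are those of bert-base-uncased, but the point is general): an embedding lookup is exactly a one-hot vector multiplied by a learned projection matrix.

```python
# An embedding lookup is a one-hot (orthogonal) vector times a
# learned projection matrix; the two paths below coincide.
import torch
import torch.nn.functional as F

vocab_size, embedding_dim = 30522, 768  # bert-base-uncased sizes
embedding = torch.nn.Embedding(vocab_size, embedding_dim)

token_id = torch.tensor([42])  # some arbitrary token index

# Path 1: ordinary table lookup, as an embedding layer does it.
lookup = embedding(token_id)

# Path 2: explicit one-hot vector projected by the learned weight matrix.
one_hot = F.one_hot(token_id, num_classes=vocab_size).float()
projected = one_hot @ embedding.weight  # (1, embedding_dim)

print(torch.allclose(lookup, projected))  # True
```

So mathematically the two views are the same operation; my question is about what the learned projection ends up encoding.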