What's Happening Inside the MultiheadAttention Module in the New PyTorch Version?

What sort of projections exactly happen inside the MultiheadAttention module?
In PyTorch 1.9.1, the query, key, and value all needed to have the same dimension (the embedding dimension, embed_dim). The key and query were then mapped to the key dimension (kdim) and the value to the value dimension (vdim) before the attention mechanism was applied, and in the output there was a mapping from num_heads × vdim back to embed_dim. This was consistent with the Transformer paper.
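
To make sure I'm describing that correctly, here is a rough sketch of the projections as I understand them from the paper. The names `W_q`, `W_k`, `W_v`, `W_o` and the dimension values are just mine, for illustration:

```python
import torch

d_model, num_heads = 16, 4
d_k = d_v = d_model // num_heads          # per-head key/value dims, as in the paper

x = torch.randn(2, 5, d_model)            # query, key, and value all start in d_model

# Per-head projection matrices (illustrative names, not the module's parameters)
W_q = torch.randn(num_heads, d_model, d_k)
W_k = torch.randn(num_heads, d_model, d_k)
W_v = torch.randn(num_heads, d_model, d_v)
W_o = torch.randn(num_heads * d_v, d_model)   # maps the concatenated heads back to d_model

q = torch.einsum('btd,hdk->bhtk', x, W_q)     # (batch, heads, seq, d_k)
k = torch.einsum('btd,hdk->bhtk', x, W_k)
v = torch.einsum('btd,hdv->bhtv', x, W_v)

scores = torch.softmax(q @ k.transpose(-2, -1) / d_k ** 0.5, dim=-1)
heads = scores @ v                            # (batch, heads, seq, d_v)
out = heads.transpose(1, 2).reshape(2, 5, -1) @ W_o   # back to (batch, seq, d_model)
print(out.shape)                              # torch.Size([2, 5, 16])
```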
In PyTorch 1.10 this seems to have changed: the query, key, and value can now each have a different dimension. What projections and calculations are happening inside the block in the updated version of PyTorch?
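
For context, this is the kind of setup I mean, with kdim and vdim different from embed_dim. The dimension values are arbitrary, and I'm just printing the projection weights I can see on the module while trying to understand what it does with them:

```python
import torch
import torch.nn as nn

embed_dim, kdim, vdim, num_heads = 16, 32, 24, 4

# kdim/vdim differ from embed_dim -- this is the case I'm asking about
mha = nn.MultiheadAttention(embed_dim, num_heads, kdim=kdim, vdim=vdim, batch_first=True)

q = torch.randn(2, 5, embed_dim)   # query: embed_dim features
k = torch.randn(2, 7, kdim)        # key:   kdim features
v = torch.randn(2, 7, vdim)        # value: vdim features

out, attn_weights = mha(q, k, v)
print(out.shape)                   # torch.Size([2, 5, 16]) -> output comes back in embed_dim

# Separate projection weights that the module exposes when kdim/vdim != embed_dim
print(mha.q_proj_weight.shape)     # (embed_dim, embed_dim)
print(mha.k_proj_weight.shape)     # (embed_dim, kdim)
print(mha.v_proj_weight.shape)     # (embed_dim, vdim)
print(mha.out_proj.weight.shape)   # (embed_dim, embed_dim)
```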