How can I figure out which implementation scaled_dot_product_attention uses?

Is there any way to find out which implementation is used by scaled_dot_product_attention? Even better, is there any information on why a given implementation was chosen, for example, when the number of heads is not a multiple of 8?