Specific implementation of the function scaled_dot_product_attention?

https://pytorch.org/docs/stable/generated/torch.nn.functional.scaled_dot_product_attention.html#torch-nn-functional-scaled-dot-product-attention
The implementation shown in this document does not actually behave the same as torch.nn.functional.scaled_dot_product_attention.
I reached this conclusion by experiment: I replaced the call to the built-in function with the implementation described in the document and compared the results, and the outputs differed.
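For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how a naive implementation can be checked against the built-in. It assumes the simplest case only (no attention mask, no dropout, default 1/sqrt(d_k) scaling); `manual_sdpa` is my own illustrative name, not something from the document or from PyTorch.

```python
import torch
import torch.nn.functional as F

def manual_sdpa(query, key, value):
    # Naive reference: softmax(Q K^T / sqrt(d_k)) V
    # (no mask, no dropout, default scaling)
    d_k = query.size(-1)
    scores = query @ key.transpose(-2, -1) / (d_k ** 0.5)
    weights = torch.softmax(scores, dim=-1)
    return weights @ value

torch.manual_seed(0)
# (batch, heads, sequence length, head dim)
q = torch.randn(2, 4, 8, 16)
k = torch.randn(2, 4, 8, 16)
v = torch.randn(2, 4, 8, 16)

builtin = F.scaled_dot_product_attention(q, k, v)
manual = manual_sdpa(q, k, v)

# Fused kernels can introduce small numerical differences,
# so compare with a tolerance rather than exact equality.
print(torch.allclose(builtin, manual, atol=1e-5))
```

If the document's implementation diverges beyond such a tolerance, the difference is likely in masking, dropout, or the scaling factor rather than in floating-point noise.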