I heard the PyTorch team is adding FlashAttention support for Transformer.

If so, when are we going to have it?
You would have to ask the team about the timing, because FlashAttention is not a model implementation. It is a kernel-level implementation of the attention computation itself (the fused matrix multiplications and softmax), so it sits below any particular model.
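
For a concrete reference point (not an official roadmap answer): PyTorch 2.0 later exposed fused attention kernels, including FlashAttention, through `torch.nn.functional.scaled_dot_product_attention`. A minimal sketch of calling that entry point, with the dispatch behavior noted in comments:

```python
import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Toy tensors shaped (batch, num_heads, seq_len, head_dim).
q = torch.randn(2, 8, 128, 64, device=device, dtype=dtype)
k = torch.randn(2, 8, 128, 64, device=device, dtype=dtype)
v = torch.randn(2, 8, 128, 64, device=device, dtype=dtype)

# scaled_dot_product_attention picks a fused backend when one is
# available; on supported GPUs with fp16/bf16 inputs that can be the
# FlashAttention kernel, otherwise it falls back to the plain math path.
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 8, 128, 64])
```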
