I heard the PyTorch team is adding FlashAttention support for Transformer.

If so, when are we going to have it?
You would have to ask the team about the timing, because FlashAttention is not a model implementation. It is a kernel-level implementation of the attention computation itself (the fused matrix multiplications and softmax), so it sits below any particular model.
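
For a concrete reference point (not an official roadmap answer): PyTorch 2.0 later exposed fused attention kernels, including FlashAttention, through `torch.nn.functional.scaled_dot_product_attention`. A minimal sketch of calling that entry point, with the dispatch behavior noted in comments:

```python
import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Toy tensors shaped (batch, num_heads, seq_len, head_dim).
q = torch.randn(2, 8, 128, 64, device=device, dtype=dtype)
k = torch.randn(2, 8, 128, 64, device=device, dtype=dtype)
v = torch.randn(2, 8, 128, 64, device=device, dtype=dtype)

# scaled_dot_product_attention picks a fused backend when one is
# available; on supported GPUs with fp16/bf16 inputs that can be the
# FlashAttention kernel, otherwise it falls back to the plain math path.
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 8, 128, 64])
```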
