PyTorch 2 FlashAttention and Memory Efficient Attention

BillyGun27 · March 17, 2023, 7:13am

I read that PyTorch added memory-optimized attention algorithms such as FlashAttention and Memory Efficient Attention:

https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html?highlight=scaled_dot_product#torch.nn.functional.scaled_dot_product_attention
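For context, here is a minimal sketch of calling the function linked above on desktop PyTorch 2.0 or later and comparing it against the unfused formula. The tensor shapes are made up for illustration, and which kernel actually runs (FlashAttention, memory-efficient, or the plain math fallback) depends on the PyTorch version, device, and inputs, so this does not by itself show that a fused kernel was used:

```python
import math

import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Illustrative shapes: (batch, heads, sequence length, head dimension).
q = torch.randn(2, 4, 16, 32)
k = torch.randn(2, 4, 16, 32)
v = torch.randn(2, 4, 16, 32)

# Fused entry point; PyTorch picks a backend for the given inputs and device.
out = F.scaled_dot_product_attention(q, k, v, is_causal=False)

# Unfused reference: softmax(Q K^T / sqrt(d)) V.
# This materializes the full (seq x seq) score matrix, which is the memory
# cost that the optimized kernels are designed to avoid.
scores = (q @ k.transpose(-2, -1)) / math.sqrt(q.size(-1))
ref = torch.softmax(scores, dim=-1) @ v

print(out.shape)
print(torch.allclose(out, ref, atol=1e-4))
```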

Does anyone know whether PyTorch Mobile will later support FlashAttention or other memory-optimized attention algorithms? And might they also be supported on a mobile GPU backend?