Is pytorch SDPA using flash attention V2?

I wanted to know whether PyTorch is using the V2 of Flash Attention here :slight_smile: torch.nn.functional.scaled_dot_product_attention — PyTorch master documentation

The function's documentation doesn't say; only V1 is mentioned (link above). However, it does seem to be the case according to the blog:

So, is Flash Attention V2 implemented or not?

+1, I have the same question.

I think the current torch version (2.1.0) only supports Flash Attention V1.

Nightlies should have it, and according to this post, PyTorch 2.2 should be the target release.
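Whichever version you're on, you can check which fused SDPA backends your install exposes and force the flash backend so you get an explicit error if it can't be used. A minimal sketch (assumes PyTorch >= 2.0; the flash kernel itself additionally needs a CUDA GPU and fp16/bf16 inputs):

```python
import torch
import torch.nn.functional as F

print(torch.__version__)

# These flags report whether each fused backend is *enabled* on this
# install, not which one a given call will actually pick:
flags = {
    "flash": torch.backends.cuda.flash_sdp_enabled(),
    "mem_efficient": torch.backends.cuda.mem_efficient_sdp_enabled(),
    "math": torch.backends.cuda.math_sdp_enabled(),
}
print(flags)

# To verify the flash kernel really runs, disable the fallbacks inside
# the sdp_kernel context manager; SDPA then raises if flash is unusable.
if torch.cuda.is_available():
    q = k = v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
    with torch.backends.cuda.sdp_kernel(
        enable_flash=True, enable_math=False, enable_mem_efficient=False
    ):
        out = F.scaled_dot_product_attention(q, k, v)
    print(out.shape)
```

Note that the docs don't promise which Flash Attention version backs the kernel; that's decided by the build, which is why checking your installed version (and the release notes) matters.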