Error in F.scaled_dot_product_attention

When I use F.scaled_dot_product_attention with SDPBackend.CUDNN_ATTENTION and a float attention mask (some values in the mask are -inf), I get a lot of NaNs in the output tensor. With SDPBackend.EFFICIENT_ATTENTION or SDPBackend.MATH everything works fine, and with SDPBackend.FLASH_ATTENTION I get:

RuntimeError: No available kernel. Aborting execution
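A minimal sketch of the setup (shapes and mask pattern are illustrative, not taken from my actual model). On CPU, or on CUDA with the math/efficient backends, a float mask containing -inf produces finite output as long as each query row has at least one unmasked key; the NaNs only appear when the cuDNN backend is forced:

```python
import torch
import torch.nn.functional as F

# Small illustrative tensors: (batch, heads, seq_len, head_dim)
q = torch.randn(1, 2, 4, 8)
k = torch.randn(1, 2, 4, 8)
v = torch.randn(1, 2, 4, 8)

# Float (additive) mask: -inf disables a position.
# Keep at least one unmasked key per query row, otherwise
# NaNs are expected on every backend.
mask = torch.zeros(1, 1, 4, 4)
mask[..., 2:] = float("-inf")  # mask out the last two keys

out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
print(torch.isnan(out).any().item())  # False here (math/efficient path)

# On CUDA, forcing the cuDNN backend is where the NaNs show up
# (requires torch >= 2.3 for torch.nn.attention.sdpa_kernel):
# from torch.nn.attention import SDPBackend, sdpa_kernel
# with sdpa_kernel(SDPBackend.CUDNN_ATTENTION):
#     out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
```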