Backward pass of scaled_dot_product_attention fails on H100

I ran into the same problem with loss.backward(). After I switched PyTorch to the Preview (Nightly) build, it worked.
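
For anyone hitting this, a sketch of how to switch to the nightly build with pip; the CUDA tag in the index URL (`cu121` here) is an assumption, so substitute the one that matches your environment from the selector on pytorch.org:

```shell
# Install the PyTorch Preview (Nightly) build.
# The CUDA version in the index URL (cu121) is an example -- pick the tag
# matching your CUDA toolkit from https://pytorch.org/get-started/locally/
pip3 install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121
```

Uninstalling the stable `torch` package first (or using a fresh virtual environment) avoids mixing the two builds.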