Hi, I’m trying to experiment with tweaks and potential upgrades to FlashAttention, and I’m wondering where the best place to start is. Does the PyTorch integration copy-paste/pull from the original FlashAttention repo, or are there implementation changes made along with the integration?
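For context, here’s roughly how I’m hitting the FlashAttention path through PyTorch today. This is just a minimal sketch assuming the `torch.backends.cuda.sdp_kernel` context manager (newer releases expose `torch.nn.attention.sdpa_kernel` instead) to force `scaled_dot_product_attention` onto the flash backend:

```python
import torch
import torch.nn.functional as F

# Random Q/K/V in the (batch, heads, seq_len, head_dim) layout SDPA expects;
# fp16 on CUDA, since the flash backend doesn't cover fp32.
q = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)

# Disable the math and mem-efficient fallbacks so the call errors out
# unless it actually dispatches to the FlashAttention kernels.
with torch.backends.cuda.sdp_kernel(
    enable_flash=True, enable_math=False, enable_mem_efficient=False
):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```

So my question is really about what kernel code that `with` block ends up dispatching to, and whether editing it means building PyTorch from source or whether I can iterate in the upstream FlashAttention repo instead.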
Thanks!