A recent PR, "Add SM89 support for f8f8bf16_rowwise()" by alexsamardzic (pytorch/pytorch Pull Request #144348), "introduced support for `_scaled_mm` operator with FP8 inputs on SM89 architecture. The support is based on CUTLASS library, that is header-only C++ library, so this new functionality gets fully built along with PyTorch build; however, it will get built only in case the build includes SM89 among targets."
Unfortunately, this means the feature is only available when `sm_89` is on the list of build targets.
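For anyone who wants to check their own install, here is a rough sketch of how one might see whether a given PyTorch build was compiled with `sm_89` among its targets. It relies on `torch.cuda.get_arch_list()`, which reports the CUDA architectures the installed binary was built for. The helper name `has_sm89` is just something I made up for illustration, and this only checks the build targets, not whether `_scaled_mm` with FP8 inputs actually works on your GPU.

```python
def has_sm89(arch_list):
    """Return True if 'sm_89' appears in a list of compiled CUDA targets.

    `arch_list` is expected to look like the output of
    torch.cuda.get_arch_list(), e.g. ['sm_80', 'sm_86', 'sm_90'].
    (Hypothetical helper, not part of PyTorch.)
    """
    return "sm_89" in arch_list


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        # Empty list if this build has no CUDA support or CUDA is unavailable.
        archs = torch.cuda.get_arch_list()
        print("compiled targets:", archs)
        print("sm_89 included:", has_sm89(archs))
```

If `sm_89` is missing from the list, the build would need to be redone with SM89 among the targets for the new code path to be compiled in, per the PR description quoted above.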
I just opened a ticket for this: "Support `sm_89` in Stable/Nightly/Docker Images" (pytorch/pytorch Issue #145632).
Does this make sense?