How to quantize my TorchScript model to fp8

mikeybydun1 · April 12, 2025, 10:36pm

Hey,
I have a TorchScript model that runs torch.prod on a tensor of shape (1000, 400, 400, 144).
I am running this TorchScript model on an Nvidia H100 GPU.
I am trying to find a way to use fp8 quantization.
How can I do this when creating the model?
Thanks!

jerryzh168 · April 18, 2025, 3:22am

If you don’t need to use TorchScript, you can try the quantization module in torchao (torchao/quantization on the main branch of the pytorch/ao repository on GitHub).