Hey,
I have a TorchScript model that runs torch.prod on a tensor of shape (1000, 400, 400, 144).
I am running this TorchScript model on an NVIDIA H100 GPU.
I am trying to find a way to use fp8 quantization.
How can I do this when creating the model?
Thanks!
If you don’t need to use TorchScript, you can try torchao: ao/torchao/quantization at main · pytorch/ao · GitHub
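
For a rough sense of what fp8 would save on a tensor of the shape in the question, here is a minimal back-of-the-envelope sketch in plain Python. It counts only the input tensor, not the torch.prod output or any intermediates. The helper name footprint_gb and the bytes-per-element table are my own illustration, not part of torchao or PyTorch.

```python
from math import prod

# Shape taken from the question above.
shape = (1000, 400, 400, 144)

# Storage size per element for each format (illustrative table, not a library API).
BYTES_PER_ELEMENT = {"fp32": 4, "fp16": 2, "fp8": 1}


def footprint_gb(shape, dtype):
    """Return the size of one dense tensor in decimal gigabytes (1 GB = 1e9 bytes)."""
    return prod(shape) * BYTES_PER_ELEMENT[dtype] / 1e9


if __name__ == "__main__":
    print(f"elements: {prod(shape):,}")
    for dtype in BYTES_PER_ELEMENT:
        print(f"{dtype}: {footprint_gb(shape, dtype):.2f} GB")
```

That works out to about 23 billion elements, so roughly 92 GB in fp32, 46 GB in fp16, and 23 GB in fp8. Compare those numbers against the memory on your card to see which formats can hold the input at all.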