Support for quantization in int16

Hi there,

I see on the Quantized Tensors wiki page (Introducing Quantized Tensor · pytorch/pytorch Wiki · GitHub) that there are plans to support int16 quantization. However, that page was last edited in 2020.

Are there any updates on support for int16 quantization?

Thanks,
Miranda

No. If you wanted to do that, you'd normally just use bf16 or fp16 quantization instead. There are no plans to support int16 quantization in PyTorch at present.
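
As a rough sketch of what that looks like (assuming an eager-mode nn.Module and hardware with decent bf16 support; the toy model here is just an illustration), dropping weights and activations to bf16 is a one-liner:

```python
import torch
import torch.nn as nn

# Toy model standing in for whatever you want to shrink (hypothetical example).
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))

# Cast parameters and buffers to bfloat16 (use .to(torch.float16) / .half() for fp16).
model_bf16 = model.to(torch.bfloat16)

# Inputs need the matching dtype at inference time.
x = torch.randn(4, 128, dtype=torch.bfloat16)
with torch.no_grad():
    out = model_bf16(x)
print(out.dtype)  # torch.bfloat16
```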

Is there a way to do custom quantization so I can quantize to int16?

Yeah, please check out our new flow: Quantization — PyTorch main documentation
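
If you really do need int16 specifically, here's a minimal sketch of a manual symmetric per-tensor scheme using plain tensor ops (no native int16 quantized tensor involved; the scale handling and 1e-12 floor are just illustrative choices, not an official API):

```python
import torch

def quantize_int16(x: torch.Tensor):
    """Symmetric per-tensor quantization of a float tensor to int16."""
    qmax = 2**15 - 1  # 32767
    # Avoid a zero scale for an all-zero tensor (arbitrary small floor).
    scale = x.abs().max().clamp(min=1e-12) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax).to(torch.int16)
    return q, scale

def dequantize_int16(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.to(torch.float32) * scale

w = torch.randn(256, 128)
q, scale = quantize_int16(w)
w_hat = dequantize_int16(q, scale)
print(q.dtype, (w - w_hat).abs().max())  # torch.int16, small reconstruction error
```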

Which backend do you want to deploy to?