Hi,
I’m working with ExecuTorch and torch.ao to enable quantization-aware training (QAT) in our workflow, and possibly to lower a quantized implementation through ExecuTorch without a delegate. I wrote a Quantizer that can be configured to match multiple qschemes, and I get a properly annotated graph that can be used for QAT. After PT2E conversion I get a model with inserted quantize/dequantize nodes around the (floating-point) ATen operators, and I can convert it to Edge. PyTorch has a set of quantized ATen operators (pytorch/aten/src/ATen/native/quantized at main · pytorch/pytorch · GitHub), and I’d like to be able to use them to run the model inference.

Looking through the ExecuTorch code, there is a mention of replace_quantized_partition_with_op in exir/backend/utils, but I don’t see it used anywhere in the repository. Is there any way to replace dq/op/q partitions in the EXIR graph with EXIR/ATen quantized operators without using a delegate?

In executorch/kernels/aten/functions.yaml there are no quantized ATen operators registered, and the quantized folder only registers a very limited set of operations (mainly quantize/dequantize). Is there any plan to support quantized operators in the runtime without a delegate?
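To make the question concrete, here is a minimal sketch of the kind of dq/op/q partition rewrite I have in mind, expressed with torch.fx.subgraph_rewriter.replace_pattern. The dequantize/quantize/quantized_add functions below are illustrative stand-ins, not the real torch.ops.quantized_decomposed.* or quantized ATen operators, and this operates on a plain FX graph rather than an EXIR graph:

```python
import torch
import torch.fx as fx
from torch.fx import subgraph_rewriter


# Stand-ins for the q/dq ops inserted by convert_pt2e and for a fused
# quantized kernel. Illustrative placeholders only, NOT the real
# torch.ops.quantized_decomposed.* or quantized ATen operators.
# torch.fx.wrap keeps them as leaf call_function nodes during tracing.
@torch.fx.wrap
def dequantize(x, scale, zero_point):
    return (x - zero_point) * scale


@torch.fx.wrap
def quantize(x, scale, zero_point):
    return torch.clamp(torch.round(x / scale) + zero_point, 0, 255)


@torch.fx.wrap
def quantized_add(a, b, scale, zero_point):
    # A "fused" add that consumes and produces quantized values directly.
    return quantize(
        dequantize(a, scale, zero_point) + dequantize(b, scale, zero_point),
        scale, zero_point,
    )


class M(torch.nn.Module):
    """Model as it looks after PT2E conversion: dq -> add -> q."""
    def forward(self, a, b, scale, zero_point):
        x = dequantize(a, scale, zero_point)
        y = dequantize(b, scale, zero_point)
        return quantize(x + y, scale, zero_point)


def pattern(a, b, scale, zero_point):
    return quantize(
        dequantize(a, scale, zero_point) + dequantize(b, scale, zero_point),
        scale, zero_point,
    )


def replacement(a, b, scale, zero_point):
    return quantized_add(a, b, scale, zero_point)


gm = fx.symbolic_trace(M())
# Replace every dq/add/q partition with a single quantized_add node.
matches = subgraph_rewriter.replace_pattern(gm, pattern, replacement)
```

After the rewrite, gm calls quantized_add instead of the dq/add/q triple while producing the same numerical results. My question is essentially whether there is a supported way to do this kind of rewrite against the EXIR graph, with real quantized operators registered in the runtime.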