Quantized ONNX model runs slower

minhhotboy9x · March 22, 2024, 2:58am

Hi, I’m trying to quantize a simple model with several conv2d layers. When I compared the quantized ONNX model with the original model on CPU, the quantized model ran slower.
Here is my code:

Can anyone help me with this, please? Thank you very much.

jerryzh168 · March 22, 2024, 10:44pm

Is this asking about ONNX performance? Maybe open an issue in the ONNX repo: Issues · onnx/onnx · GitHub
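
Before filing, it may help to confirm the slowdown with a repeatable measurement. Below is a rough sketch of a timing harness, assuming onnxruntime and numpy are installed and that each model takes a single float32 input. The helper names `benchmark` and `compare_models`, the file names, and the input shape are made up for illustration and are not taken from the original post.

```python
import statistics
import time


def benchmark(run_once, warmup=10, iters=100):
    """Return the median wall-clock latency of run_once() in milliseconds."""
    # Warm-up calls are not timed, so one-time setup cost does not skew the result.
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def compare_models(fp32_path, int8_path, input_shape):
    """Print the median CPU latency of two ONNX models fed the same random input.

    Assumption: both models have exactly one float32 input of shape input_shape.
    """
    # Imported here so that benchmark() stays usable without these packages.
    import numpy as np
    import onnxruntime as ort

    x = np.random.rand(*input_shape).astype(np.float32)
    for path in (fp32_path, int8_path):
        sess = ort.InferenceSession(path, providers=["CPUExecutionProvider"])
        input_name = sess.get_inputs()[0].name
        latency = benchmark(lambda: sess.run(None, {input_name: x}))
        print(f"{path}: {latency:.3f} ms (median)")
```

A call might look like `compare_models("model_fp32.onnx", "model_int8.onnx", (1, 3, 224, 224))`, with the paths and shape replaced by the real ones. Taking the median over many runs after a warm-up makes the comparison less sensitive to one-off spikes, and the resulting numbers would be useful to include in an issue.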