Quantized ONNX model runs slower

Hi, I’m trying to quantize a simple model with several conv2d layers. When I compared the quantized ONNX model with the original model on CPU, the quantized model ran slower.
Here is my code:

Can anyone help me with this, please? Thank you very much!

Is this a question about ONNX performance? If so, it may be better to open an issue in the ONNX repo (Issues · onnx/onnx · GitHub).
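Before filing an issue, it may help to confirm the slowdown with a repeatable measurement. Below is a minimal timing sketch, not the original poster's code: `median_latency_ms` and `compare` are hypothetical helper names, and the warm-up and iteration counts are arbitrary assumptions. It times any two zero-argument callables, so it does not depend on a particular runtime.

```python
import statistics
import time


def median_latency_ms(run, warmup=10, iters=100):
    """Return the median wall-clock time of run() in milliseconds."""
    for _ in range(warmup):
        run()  # warm-up calls are executed but not timed
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def compare(run_fp32, run_int8, **kwargs):
    """Time both callables; return (fp32_ms, int8_ms, speedup).

    speedup > 1 means the quantized callable was faster,
    speedup < 1 means it was slower.
    """
    fp32_ms = median_latency_ms(run_fp32, **kwargs)
    int8_ms = median_latency_ms(run_int8, **kwargs)
    return fp32_ms, int8_ms, fp32_ms / int8_ms
```

If the models are run through ONNX Runtime, each callable could be something like `lambda: session.run(None, {input_name: x})`, where `session` is an `onnxruntime.InferenceSession` created with `providers=["CPUExecutionProvider"]`, `input_name` comes from `session.get_inputs()[0].name`, and `x` is a fixed input array. Using the same input and the same provider for both models keeps the comparison fair, and the median is less sensitive to occasional slow runs than the mean.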
