Topic | Replies | Views | Activity
----- | ------- | ----- | --------
About the quantization category | 0 | 2219 | October 2, 2019
Histogram Calibration taking incredibly long time | 0 | 2 | April 19, 2024
Can an int8 model derived from pytorch's QAT training be converted directly to tensorRT? | 2 | 40 | April 18, 2024
How can we export the model quantized by `PyTorch 2 Export Quantization` to the binary file? | 1 | 38 | April 15, 2024
Quantization - RuntimeError: apply_dynamic is not implemented for this packed parameter type | 2 | 70 | April 15, 2024
FX mode static_quantization for YOLOv7 | 7 | 189 | April 10, 2024
Could not run 'aten::quantize_per_tensor.tensor_qparams' with arguments from the 'QuantizedCPU' | 0 | 40 | April 9, 2024
Fixed scale and zero point with FixedQParamsObserver | 0 | 48 | April 5, 2024
Network pruning error | 15 | 1179 | April 5, 2024
Search and modify layer/module outputs by name | 0 | 53 | April 3, 2024
Question on skipping quantization on unsupported modules | 10 | 1580 | April 3, 2024
Error during QAT training of ResNet50 | 4 | 62 | April 2, 2024
Error with static quantization | 2 | 161 | April 2, 2024
I saved the quantized weight and loaded it with the model after torch.ao.quantization.convert(). how do I print the output of each layer of the network? | 5 | 269 | April 2, 2024
Is pytorch simulating the quantization? | 1 | 141 | April 2, 2024
Quantization Bug in Concatenation of Tensor | 1 | 114 | April 2, 2024
Roadmap for torch.ao? | 2 | 109 | April 2, 2024
Variable-bit (sub 8-bits) quantization for custom hardware deployment with power-of-two (pot) scales | 9 | 995 | April 2, 2024
Question about quint8 and qint8 | 1 | 75 | March 29, 2024
Do I really need two separate model definition for a quantized and an "unquantized" model? | 2 | 93 | March 28, 2024
QAT specific layers of a model | 1 | 66 | March 28, 2024
Quantizing model I'm hitting createStatus == pytorch_qnnp_status_success INTERNAL ASSERT FAILED | 1 | 80 | March 28, 2024
RuntimeError in torch.quantization.convert after QAT on GPU | 2 | 80 | March 28, 2024
Dequantize tensors from int8 to fp16 | 3 | 115 | March 28, 2024
Run quantized model on GPU | 1 | 102 | March 25, 2024
Graph tracing false when meeting tensor slicing operation | 6 | 183 | March 25, 2024
For 4bit quantization | 2 | 610 | March 22, 2024
Quantized onnx model run slower | 1 | 77 | March 22, 2024
Implementing Quantized Linear Layer in Numpy | 1 | 147 | March 21, 2024
How to inference with smoothquant quantized model with pytorch? | 6 | 588 | March 20, 2024