Latest quantization topics

Topic	Replies	Views	Activity
Correctly changing precision in DLRM	1	349	February 20, 2024
Wav2vec2 quantization dimension error	4	561	February 20, 2024
Decrease in model parameters in dynamic quantization	3	395	February 13, 2024
Dynamic Quantization produces inconsistent outputs	1	515	February 13, 2024
Could not run 'quantized::conv2d.new'	2	1062	February 13, 2024
Quantizing to int8 without stubs for input and output?	6	1135	February 13, 2024
Why does module fusion replace fused modules with nn.Identity?	1	350	February 13, 2024
How can I save a convert_pt2e model?	1	531	February 1, 2024
RuntimeError: promoteTypes with quantized numbers is not handled yet; figure out what the correct rules should be, offending types: QUInt8 Float	13	1002	February 1, 2024
Inference with own scaling factors	0	293	January 21, 2024
Question about QAT	1	342	January 19, 2024
Select the right observers in QAT	5	786	January 19, 2024
AttributeError: 'NoneType' object has no attribute 'dequantize'	10	1133	January 14, 2024
Could not run 'aten::_log_softmax.out' with arguments from the 'QuantizedCPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build)	2	889	January 13, 2024
Convert back to Unquantized model	14	2247	January 13, 2024
Extremely bad LSTM Static Quantization performance compared to Dynamic	3	1317	January 12, 2024
How to convert the quantized model to tensorrt for GPU inference	9	2481	January 11, 2024
Is there a way to perform inference on the QAT model using a GPU?	1	620	January 11, 2024
NotImplementedError: Could not run 'aten::_slow_conv2d_forward' with arguments from the 'QuantizedCPU' backend	4	2509	January 11, 2024
Resnet18 fx_qat to onnx	1	427	January 11, 2024
Significant Slowdown in Inference Speed with Quantized Model in PyTorch 2.1 pt2e	5	2423	January 7, 2024
How to extract the intermediate layers of vgg16 model	1	433	January 5, 2024
Unconverted GroupNorm with FX Graph Mode Quantization	9	655	December 18, 2023
BatchNorm and ConvTranspose Fusion for QAT with FX Graph Mode	3	703	December 18, 2023
Unchanged behaviour of using a pretrained Model for QAT	1	327	December 15, 2023
Does quantization in eager mode require inserting multiple different FFs?	1	352	December 15, 2023
In flatten the output scale is different from the input scale	1	323	December 15, 2023
ONNX export of quantized model	39	25371	December 7, 2023
Shall I remove the BN and ReLU in C progress?	1	418	December 5, 2023
Missing Histograms for LayerNorm in Numeric Suite Analysis	3	492	November 30, 2023