Latest quantization topics

Topic	Replies	Views	Activity
About the quantization category	0	2228	October 2, 2019
How can we export the model quantized by `PyTorch 2 Export Quantization` to the binary file?	2	50	April 23, 2024
Histogram Calibration taking incredibly long time	3	45	April 23, 2024
Can an int8 model derived from PyTorch's QAT training be converted directly to TensorRT?	2	47	April 18, 2024
Quantization - RuntimeError: apply_dynamic is not implemented for this packed parameter type	2	81	April 15, 2024
FX mode static_quantization for YOLOv7	7	200	April 10, 2024
Could not run 'aten::quantize_per_tensor.tensor_qparams' with arguments from the 'QuantizedCPU'	0	47	April 9, 2024
Fixed scale and zero point with FixedQParamsObserver	0	61	April 5, 2024
Network pruning error	15	1188	April 5, 2024
Search and modify layer/module outputs by name	0	61	April 3, 2024
Question on skipping quantization on unsupported modules	10	1590	April 3, 2024
Error during QAT training of ResNet50	4	70	April 2, 2024
Error with static quantization	2	169	April 2, 2024
I saved the quantized weights and loaded them with the model after torch.ao.quantization.convert(). How do I print the output of each layer of the network?	5	273	April 2, 2024
Is PyTorch simulating the quantization?	1	149	April 2, 2024
Quantization Bug in Concatenation of Tensor	1	125	April 2, 2024
Roadmap for torch.ao?	2	117	April 2, 2024
Variable-bit (sub 8-bits) quantization for custom hardware deployment with power-of-two (pot) scales	9	1005	April 2, 2024
Question about quint8 and qint8	1	83	March 29, 2024
Do I really need two separate model definitions for a quantized and an "unquantized" model?	2	101	March 28, 2024
QAT specific layers of a model	1	72	March 28, 2024
Quantizing a model hits createStatus == pytorch_qnnp_status_success INTERNAL ASSERT FAILED	1	88	March 28, 2024
RuntimeError in torch.quantization.convert after QAT on GPU	2	87	March 28, 2024
Dequantize tensors from int8 to fp16	3	127	March 28, 2024
Run quantized model on GPU	1	115	March 25, 2024
Graph tracing false when meeting tensor slicing operation	6	193	March 25, 2024
For 4bit quantization	2	616	March 22, 2024
Quantized ONNX model runs slower	1	87	March 22, 2024
Implementing Quantized Linear Layer in Numpy	1	154	March 21, 2024
How to run inference with a SmoothQuant-quantized model in PyTorch?	6	603	March 20, 2024