Latest quantization topics

Topic	Replies	Views	Activity
Is this for only Linux? ImportError: cannot import name 'quantize_' from 'torchao.quantization'	2	1624	October 17, 2024
Usage of tensor attributes in FX quantization	1	223	October 17, 2024
Isn't Bias normally int Quantized in INT8 PTSQ model?	1	204	October 9, 2024
Inference accuracy mismatch between original, quantized, dequantized model	2	348	September 18, 2024
Fixed scale and zero point with FixedQParamsObserver	2	673	September 12, 2024
Error in running quantised model RuntimeError: Could not run 'quantized::conv2d.new' with arguments from the 'CPU' backend	6	4523	August 29, 2024
Using AMP with QAT	1	629	August 27, 2024
Is bias quantized when I run pt2e quantization?	12	429	August 21, 2024
The results of torch.profiler() and time.time() do not match	8	1382	August 20, 2024
Reproduce qconv kernel for x86	4	287	August 19, 2024
Torch.jit.script does not work on a quantized model	6	490	August 14, 2024
Confusion Regarding Quantization on GPUs with PyTorch	1	360	August 13, 2024
Is fuse_fx supposed to precede convert_fx in the quant pipeline?	1	188	August 9, 2024
How to adjust the model to eliminate errors in convert_fx()?	10	354	July 18, 2024
Relationship between GPU Memory Usage and Batch Size	8	10211	July 17, 2024
Question about QAT quantization with torch.fx	7	676	July 16, 2024
Fusing a QAT model post-training	4	536	July 2, 2024
Relative error greater than unit roundoff for torch.float16	1	161	July 2, 2024
QuantStub with values in [-128,127]	6	888	July 2, 2024
TypeError: quantized_add() missing 2 required positional arguments: 'op_scale' and 'op_zero_point'	8	1285	July 2, 2024
Pytorch quantized model to ONNX - quantized_decomposed::quantize_per_tensor Error	3	651	July 2, 2024
After the neural network is quantized, how to use the GPU to infer the model?	1	296	June 28, 2024
Input data range after quantization	1	282	June 28, 2024
Quantizer Backend for Linear Op intermittent failures (ExecuTorch)	6	649	June 28, 2024
Random quantization	1	285	June 15, 2024
Why are `torch.bool`'s elements 1 byte and not 1 bit?	2	1260	June 8, 2024
Implementing Quantized Linear Layer in Numpy	2	854	June 8, 2024
Accessing input/output of unnamed functional layers via hooks	3	332	June 3, 2024
Inference error after int8 quantization with pytorch	12	3331	June 3, 2024
Qlinear (ONEDNN): data type of input should be QUint8	2	573	June 3, 2024