| Topic | Replies | Views | Activity |
|---|---|---|---|
| Inserting Unnecessary Fake Quants during QAT? | 2 | 253 | November 12, 2024 |
| Torch.bfloat16 < how does it work? in bf 16 model | 1 | 264 | November 4, 2024 |
| pytorch quantized linear function gives shape invalid error | 3 | 225 | November 1, 2024 |
| How to lower to target backend? | 3 | 294 | November 1, 2024 |
| Questions about build customized quantizer | 1 | 371 | October 23, 2024 |
| Documentation about the Post Training Quantization not clear | 6 | 171 | October 23, 2024 |
| Quantized model and Tensorrt deployment problem | 1 | 57 | October 22, 2024 |
| Significant Accuracy Drop After "Custom" Activation Quantization – Seeking Debugging Suggestions | 1 | 59 | October 19, 2024 |
| Is this for only Linux? ImportError: cannot import name 'quantize_' from 'torchao.quantization' | 2 | 1330 | October 17, 2024 |
| Usage of tensor attributes in FX quantization | 1 | 166 | October 17, 2024 |
| Isn't Bias normally int Quantized in INT8 PTSQ model? | 1 | 121 | October 9, 2024 |
| Inference accuracy mismatch between original, quantized, dequantized model | 2 | 217 | September 18, 2024 |
| Fixed scale and zero point with FixedQParamsObserver | 2 | 555 | September 12, 2024 |
| Error in running quantised model RuntimeError: Could not run 'quantized::conv2d.new' with arguments from the 'CPU' backend | 6 | 4219 | August 29, 2024 |
| Using AMP with QAT | 1 | 584 | August 27, 2024 |
| Is bias quantized when I run pt2e quantization? | 12 | 195 | August 21, 2024 |
| The results of torch.profiler() and time.time() do not match | 8 | 1215 | August 20, 2024 |
| Reproduce qconv kernel for x86 | 4 | 180 | August 19, 2024 |
| Torch.jit.script does not work on a quantized model | 6 | 328 | August 14, 2024 |
| Confusion Regarding Quantization on GPUs with PyTorch | 1 | 198 | August 13, 2024 |
| Is fuse_fx supposed to preceed convert_fx in the quant pipeline? | 1 | 134 | August 9, 2024 |
| How to adjust the model to eliminate errors in convert_fx()? | 10 | 223 | July 18, 2024 |
| Relationship between GPU Memory Usage and Batch Size | 8 | 9064 | July 17, 2024 |
| Question about QAT quantization with torch.fx | 7 | 524 | July 16, 2024 |
| Fusing a QAT model post-training | 4 | 327 | July 2, 2024 |
| Relative error greater than unit roundoff for torch.float16 | 1 | 121 | July 2, 2024 |
| QuantStub with values in [-128,127] | 6 | 761 | July 2, 2024 |
| TypeError: quantized_add() missing 2 required positional arguments: 'op_scale' and 'op_zero_point' | 8 | 1134 | July 2, 2024 |
| Pytorch quantized model to ONNX - quantized_decomposed::quantize_per_tensor Error | 3 | 474 | July 2, 2024 |
| After the neural network is quantized, how to use the GPU to infer the model? | 1 | 240 | June 28, 2024 |