|
Right way to insert QuantStub and DeQuantStub in eager mode quantization
|
|
6
|
218
|
April 12, 2025
|
|
QAT model is not performing as expected when compared to the original model
|
|
7
|
204
|
April 9, 2025
|
|
Can't get dynamic shape with torch.export.export_for_training
|
|
3
|
379
|
April 8, 2025
|
|
The code aims to collect data about SiLU (Sigmoid Linear Unit) activation layers in a quantized YOLOv5 model. Specifically, it: creates a custom SiLUDataCollector to replace SiLU layers; captures quantization parameters (scale and zero point); saves quanti
|
|
1
|
100
|
April 6, 2025
|
|
How to customize a quantizer using FX
|
|
1
|
66
|
April 6, 2025
|
|
Question on quantize_per_channel and dequantize
|
|
5
|
204
|
April 6, 2025
|
|
Quantized LLM inference vs quantized matrix multiplication speed in CPU
|
|
2
|
142
|
April 6, 2025
|
|
Quantization fails for custom backend
|
|
3
|
282
|
March 21, 2025
|
|
Simulating quantization to lower bit precision with quant_min/max setting on fused modules
|
|
0
|
81
|
March 19, 2025
|
|
Help Needed: High Inference Time & CPU Usage in VGG19 QAT model vs. Baseline
|
|
0
|
67
|
March 15, 2025
|
|
Is dynamic quantization in fact doing weight dequant instead of activation quant for `quantize_dynamic()`?
|
|
1
|
148
|
March 13, 2025
|
|
Trying to Understand the Scale Computation During Static Quantisation
|
|
0
|
86
|
March 2, 2025
|
|
Exclude specific layers from quantization
|
|
1
|
162
|
February 24, 2025
|
|
Questions on QAT for Wav2Vec
|
|
3
|
306
|
February 24, 2025
|
|
"Deploy Quantized Models using Torch-TensorRT" failed
|
|
5
|
352
|
February 18, 2025
|
|
Additional layer in the conv weight after quantization
|
|
1
|
301
|
February 15, 2025
|
|
Compatibility Issue: Wav2Vec2 QAT with PyTorch 2 Export
|
|
1
|
207
|
February 13, 2025
|
|
Quantized::linear (xnnpack): xnn create operator failed(2)
|
|
1
|
193
|
February 12, 2025
|
|
JIT model is a deployment model or a quantized model?
|
|
0
|
73
|
February 7, 2025
|
|
How to customize a quantization algorithm and deploy it?
|
|
2
|
115
|
February 5, 2025
|
|
Using Quantization tutorial, but the result is different
|
|
2
|
102
|
February 4, 2025
|
|
Data types on quantized models
|
|
0
|
158
|
February 4, 2025
|
|
Custom weight observer for powers of 2
|
|
2
|
869
|
January 29, 2025
|
|
Run quantized model on GPU
|
|
2
|
2388
|
January 23, 2025
|
|
Quantization of depthwise 1d convolution with QAT is slower than non-quantized
|
|
2
|
266
|
January 23, 2025
|
|
Taylor-series Approximation for Sigmoid in Integer
|
|
1
|
282
|
January 15, 2025
|
|
Triton kernel to efficiently dequantize int4
|
|
0
|
189
|
January 5, 2025
|
|
BatchNorm not fusing with Conv and ReLU
|
|
0
|
74
|
December 26, 2024
|
|
Compile Model with TensorRT
|
|
0
|
129
|
December 25, 2024
|
|
How to convert a QAT model to ONNX model
|
|
3
|
521
|
December 19, 2024
|