Topic | Replies | Views | Activity
----- | ------- | ----- | --------
About the quantization category | 0 | 2219 | October 2, 2019
Histogram Calibration taking incredibly long time | 0 | 2 | April 19, 2024
Can an int8 model derived from pytorch's QAT training be converted directly to tensorRT? | 2 | 40 | April 18, 2024
How can we export the model quantized by `PyTorch 2 Export Quantization` to the binary file? | 1 | 38 | April 15, 2024
Quantization - RuntimeError: apply_dynamic is not implemented for this packed parameter type | 2 | 70 | April 15, 2024
FX mode static_quantization for YOLOv7 | 7 | 189 | April 10, 2024
Could not run 'aten::quantize_per_tensor.tensor_qparams' with arguments from the 'QuantizedCPU' | 0 | 40 | April 9, 2024
Fixed scale and zero point with FixedQParamsObserver | 0 | 48 | April 5, 2024
Network pruning error | 15 | 1179 | April 5, 2024
Search and modify layer/module outputs by name | 0 | 53 | April 3, 2024
Question on skipping quantization on unsupported modules | 10 | 1580 | April 3, 2024
Error during QAT training of ResNet50 | 4 | 62 | April 2, 2024
Error with static quantization | 2 | 161 | April 2, 2024
I saved the quantized weight and loaded it with the model after torch.ao.quantization.convert(). how do I print the output of each layer of the network? | 5 | 269 | April 2, 2024
Is pytorch simulating the quantization? | 1 | 141 | April 2, 2024
Quantization Bug in Concatenation of Tensor | 1 | 114 | April 2, 2024
Roadmap for torch.ao? | 2 | 109 | April 2, 2024
Variable-bit (sub 8-bits) quantization for custom hardware deployment with power-of-two (pot) scales | 9 | 995 | April 2, 2024
Question about quint8 and qint8 | 1 | 75 | March 29, 2024
Do I really need two separate model definition for a quantized and an "unquantized" model? | 2 | 93 | March 28, 2024
QAT specific layers of a model | 1 | 66 | March 28, 2024
Quantizing model I'm hitting createStatus == pytorch_qnnp_status_success INTERNAL ASSERT FAILED | 1 | 80 | March 28, 2024
RuntimeError in torch.quantization.convert after QAT on GPU | 2 | 80 | March 28, 2024
Dequantize tensors from int8 to fp16 | 3 | 115 | March 28, 2024
Run quantized model on GPU | 1 | 102 | March 25, 2024
Graph tracing false when meeting tensor slicing operation | 6 | 183 | March 25, 2024
For 4bit quantization | 2 | 610 | March 22, 2024
Quantized onnx model run slower | 1 | 77 | March 22, 2024
Implementing Quantized Linear Layer in Numpy | 1 | 147 | March 21, 2024
How to inference with smoothquant quantized model with pytorch? | 6 | 588 | March 20, 2024