| Topic | Replies | Views | Date |
| --- | --- | --- | --- |
| How to customize a quantization algorithm and deploy it? | 2 | 45 | February 5, 2025 |
| Using Quantization tutorial, but the result is different | 2 | 38 | February 4, 2025 |
| Data types on quantized models | 0 | 90 | February 4, 2025 |
| Custom weight observer for powers of 2 | 2 | 740 | January 29, 2025 |
| Run quantized model on GPU | 2 | 1874 | January 23, 2025 |
| Quantization of depthwise 1d convolution with QAT is slower than non-quantized | 2 | 147 | January 23, 2025 |
| Taylor-series Approximation for Sigmoid in Integer | 1 | 126 | January 15, 2025 |
| Triton kernel to efficiently dequantize int4 | 0 | 136 | January 5, 2025 |
| BatchNorm not fusing with Conv and ReLU | 0 | 25 | December 26, 2024 |
| Compile Model with TensorRT | 0 | 72 | December 25, 2024 |
| How to convert a QAT model to ONNX model | 3 | 239 | December 19, 2024 |
| PyTorch 2 Export QAT is training | 0 | 123 | December 19, 2024 |
| Quantized GLU not implemented? | 1 | 137 | December 17, 2024 |
| Kernel Dies When Testing a Quantized ResNet101 Model in PyTorch | 2 | 46 | December 12, 2024 |
| Auto-cast and PyTorch 2 export quantization | 8 | 384 | December 9, 2024 |
| RuntimeError: Quantized cudnn conv2d is currently limited to groups = 1; received groups = 16, during QAT | 3 | 960 | December 6, 2024 |
| Support for quantization in int16 | 5 | 158 | December 5, 2024 |
| Quantize a single tensor obtained from a float32 model | 2 | 59 | November 29, 2024 |
| Simple quantisation reproduction - how to convert state dict to int8 | 1 | 64 | November 27, 2024 |
| torch.ao.nn.quantizable.modules.activation.MultiheadAttention not loading the pre-trained model weights correctly | 1 | 47 | November 27, 2024 |
| QConfig for ResNet50 with weights dtype quint8 | 5 | 190 | November 27, 2024 |
| Load custom trained parameters into quantized model | 1 | 46 | November 27, 2024 |
| Per channel setting for QAT Quantization | 1 | 46 | November 27, 2024 |
| Custom QAT using ao.nn.qat modules, is this a valid approach? | 1 | 55 | November 27, 2024 |
| Absence of qint32 in torch.ao.quantization.utils.weight_is_quantized | 1 | 132 | November 27, 2024 |
| QAT QuantizedConv2d converted to ONNX format | 1 | 145 | November 27, 2024 |
| Changing Qconfig to set datatype to int8 | 1 | 236 | November 20, 2024 |
| Inserting Unnecessary Fake Quants during QAT? | 2 | 199 | November 12, 2024 |
| Torch.bfloat16 - how does it work in a bf16 model? | 1 | 249 | November 4, 2024 |
| PyTorch quantized linear function gives shape invalid error | 3 | 183 | November 1, 2024 |