| Correctly changing precision in DLRM | 1 | 349 | February 20, 2024 |
| Wav2vec2 quantization dimension error | 4 | 561 | February 20, 2024 |
| Decrease in model parameters in dynamic quantization | 3 | 395 | February 13, 2024 |
| Dynamic Quantization produces inconsistent outputs | 1 | 515 | February 13, 2024 |
| Could not run 'quantized::conv2d.new' | 2 | 1062 | February 13, 2024 |
| Quantizing to int8 without stubs for input and output? | 6 | 1135 | February 13, 2024 |
| Why does modules fusion replace fused modules by nn.Identity? | 1 | 350 | February 13, 2024 |
| How can I save a convert_pt2e model? | 1 | 531 | February 1, 2024 |
| RuntimeError: promoteTypes with quantized numbers is not handled yet; figure out what the correct rules should be, offending types: QUInt8 Float | 13 | 1002 | February 1, 2024 |
| Inference with own scaling factors | 0 | 293 | January 21, 2024 |
| Question about QAT | 1 | 342 | January 19, 2024 |
| Select the right observers in QAT | 5 | 786 | January 19, 2024 |
| AttributeError: 'NoneType' object has no attribute 'dequantize' | 10 | 1133 | January 14, 2024 |
| Could not run 'aten::_log_softmax.out' with arguments from the 'QuantizedCPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build) | 2 | 889 | January 13, 2024 |
| Convert back to Unquantized model | 14 | 2247 | January 13, 2024 |
| Extremely bad LSTM Static Quantization performance compared to Dynamic | 3 | 1317 | January 12, 2024 |
| How to convert the quantized model to tensorrt for GPU inference | 9 | 2481 | January 11, 2024 |
| Is there a way to perform inference on the QAT model using a GPU? | 1 | 620 | January 11, 2024 |
| NotImplementedError: Could not run 'aten::_slow_conv2d_forward' with arguments from the 'QuantizedCPU' backend | 4 | 2509 | January 11, 2024 |
| Resnet18 fx_qat to onnx | 1 | 427 | January 11, 2024 |
| Significant Slowdown in Inference Speed with Quantized Model in PyTorch 2.1 pt2e | 5 | 2423 | January 7, 2024 |
| How to extract the intermediate layers of vgg16 model | 1 | 433 | January 5, 2024 |
| Unconverted GroupNorm with FX Graph Mode Quantization | 9 | 655 | December 18, 2023 |
| BatchNorm and ConvTranspose Fusion for QAT with FX Graph Mode | 3 | 703 | December 18, 2023 |
| Unchanged behaviour of using a pretrained Model for QAT | 1 | 327 | December 15, 2023 |
| Does quantization in eager mode require inserting multiple different FFs? | 1 | 352 | December 15, 2023 |
| In flatten the output scale is different from the input scale | 1 | 323 | December 15, 2023 |
| ONNX export of quantized model | 39 | 25371 | December 7, 2023 |
| Shall I remove the BN and ReLU in C progress? | 1 | 418 | December 5, 2023 |
| Missing Histograms for LayerNorm in Numeric Suite Analysis | 3 | 492 | November 30, 2023 |