Hi! I'm new to quantization. I ran into a problem while applying quantization, with the error output below:
'quantized::embedding_byte' is only available for these backends: [CPU, Meta,
BackendSelect, Python, FuncTorchDynamicLayerBackMode,
Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther,
AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU,
AutogradLazy, AutogradMeta, Tracer, AutocastCPU, AutocastCUDA,
FuncTorchBatched, BatchedNestedTensor,
FuncTorchVmapMode, Batched, VmapMode,
FuncTorchGradWrapper, PythonTLSSnapshot,
FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].
When I fall back to CPU, it raises another error:
RuntimeError: quantized::linear_dynamic() Expected a value of type 'Tensor' for argument 'X' but instead found type 'method'.
Position: 0
Here is the quantization code I wrote:
# Applying dynamic quantization to the model
for _, mod in model_fp32.named_modules():
    if isinstance(mod, torch.nn.Embedding):
        mod.qconfig = torch.ao.quantization.float_qparams_weight_only_qconfig

model_qint8 = torch.ao.quantization.quantize_dynamic(
    model_fp32,
    {torch.nn.Embedding, torch.nn.Linear},
    dtype=torch.qint8,
)
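In case it helps, here is a minimal self-contained version of the quantization step. ToyModel and its layer sizes are placeholders I made up for the repro, not my real network; this toy version runs for me on CPU, while my real model hits the errors above:

```python
import torch

# Placeholder model standing in for my real network (made up for this repro)
class ToyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = torch.nn.Embedding(100, 16)
        self.fc = torch.nn.Linear(16, 4)

    def forward(self, x):
        return self.fc(self.emb(x))

model_fp32 = ToyModel()

# Embeddings only support weight-only quantization with float qparams,
# so set that qconfig on each Embedding before converting
for _, mod in model_fp32.named_modules():
    if isinstance(mod, torch.nn.Embedding):
        mod.qconfig = torch.ao.quantization.float_qparams_weight_only_qconfig

model_qint8 = torch.ao.quantization.quantize_dynamic(
    model_fp32,
    {torch.nn.Embedding, torch.nn.Linear},
    dtype=torch.qint8,
)

# Quantized ops only run on CPU, so inputs stay on CPU here
x = torch.randint(0, 100, (2, 5))  # placeholder integer token ids
out = model_qint8(x)
print(out.shape)  # torch.Size([2, 5, 4])
```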
Is there anything I messed up? Any help would be appreciated.