Has anyone tried to apply quantization to the StreamingASR Android demo before?
Could someone provide some guidance on what to do?
I am not sure why this error occurred; is the model just not supported yet?
I am trying to apply dynamic quantization to an Emformer-RNNT-based model.
This is the quantization code that I use:
```python
import torch

model_pquant = torch.quantization.quantize_dynamic(
    wrapper,                           # the original model
    {torch.nn.LSTM, torch.nn.Linear},  # the set of layers to dynamically quantize
    dtype=torch.qint8)                 # the target dtype for quantized weights
torch.save(model_pquant, 'quant_dynamic_Emformer-RNNT_pure')
```
During inference, it produces this error:

```
RuntimeError: In ChooseQuantizationParams, min should be less than or equal to max
```
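For context, my understanding (an assumption on my part, not from the docs) is that dynamic quantization computes the min/max of each activation tensor at runtime to pick quantization parameters, so a malformed input, e.g. one containing NaN, makes the `min <= max` check inside `ChooseQuantizationParams` fail. A minimal sketch of what I mean:

```python
import torch

# Dynamically quantize a plain Linear layer, the same way as above.
lin = torch.quantization.quantize_dynamic(
    torch.nn.Linear(4, 2), {torch.nn.Linear}, dtype=torch.qint8)

ok = lin(torch.randn(1, 4))  # well-formed input: runs fine

try:
    # NaN input: min/max of the activations are NaN, so the
    # "min should be less than or equal to max" check can fire.
    lin(torch.full((1, 4), float("nan")))
except RuntimeError as e:
    print(e)
```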
After further analysis, I found that the error is caused by a malformed input tensor, since I was trying to replace the `pyaudio.PyAudio()` streaming input with an audio file.
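To mimic the streaming input from a file, I now split the waveform into fixed-size chunks and sanity-check each one before feeding it to the model. A rough sketch (`chunk_waveform` is just a helper I wrote, and the 640-sample chunk length is a placeholder; adjust it to whatever your pipeline expects):

```python
import torch

def chunk_waveform(mono, chunk_len=640):
    """Split a 1-D waveform into fixed-size chunks, zero-padding the last one."""
    for start in range(0, mono.numel(), chunk_len):
        chunk = mono[start:start + chunk_len]
        if chunk.numel() < chunk_len:  # pad the final partial chunk
            chunk = torch.nn.functional.pad(chunk, (0, chunk_len - chunk.numel()))
        # Guard against empty or NaN chunks, which is the kind of malformed
        # input that triggered the ChooseQuantizationParams error for me.
        assert chunk.numel() == chunk_len and torch.isfinite(chunk).all()
        yield chunk
```

The waveform itself can come from `torchaudio.load(path)` (averaging channels to mono first), in place of the `pyaudio.PyAudio()` stream.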