Hello,
I tried to quantize an OPT model using FX Graph Mode Quantization. model_fp is a Hugging Face OPT model. I followed the quantization tutorial like this:
```python
import copy

import torch
from torch.ao.quantization import QConfigMapping, default_dynamic_qconfig
from torch.ao.quantization import quantize_fx

model_to_quantize = copy.deepcopy(model_fp)
model_to_quantize.eval()
qconfig_mapping = QConfigMapping().set_global(default_dynamic_qconfig)
# a tuple of one or more example inputs is needed to trace the model
example_inputs = (input_fp32,)  # trailing comma so this is actually a tuple
# prepare
model_prepared = quantize_fx.prepare_fx(model_to_quantize, qconfig_mapping, example_inputs)
```
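One thing worth double-checking in the snippet above: `(input_fp32)` is not a one-element tuple in Python, because parentheses alone are just grouping; a trailing comma is required. This may not be the cause of the error, but it does mean `prepare_fx` receives the bare tensor instead of a tuple. A minimal illustration (using a plain list as a stand-in for the tensor):

```python
input_fp32 = [1.0, 2.0, 3.0]  # stand-in for a torch tensor

not_a_tuple = (input_fp32)   # parentheses only group; this is still the list itself
a_tuple = (input_fp32,)      # the trailing comma creates a one-element tuple

print(type(not_a_tuple).__name__)  # -> list
print(type(a_tuple).__name__)      # -> tuple
```

So `example_inputs = (input_fp32,)` is what actually matches the "tuple of one or more example inputs" that `prepare_fx` expects.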
I got the following error. Could you help me fix it?
File ~/transformers/models/opt/modeling_opt.py:625 in forward
    raise ValueError("You cannot specify both decoder_input_ids and decoder_inputs_embeds at the same time")