Error in FX Graph Mode Quantization

Hello,
I tried to quantize an OPT model using FX Graph Mode Quantization.

model_fp is a Hugging Face OPT model.

I followed the quantization tutorial:

model_to_quantize = copy.deepcopy(model_fp)
model_to_quantize.eval()
qconfig_mapping = QConfigMapping().set_global(torch.ao.quantization.default_dynamic_qconfig)

# a tuple of one or more example inputs is needed to trace the model
example_inputs = (input_fp32,)

# prepare
model_prepared = quantize_fx.prepare_fx(model_to_quantize, qconfig_mapping, example_inputs)

I got the following error. Could you help me fix it?

File ~/transformers/models/opt/modeling_opt.py:625 in forward
    raise ValueError("You cannot specify both decoder_input_ids and decoder_inputs_embeds at the same time")

ValueError: You cannot specify both decoder_input_ids and decoder_inputs_embeds at the same time

Can you provide a repro or more details? That error seems unrelated to quantization, given that it's occurring in transformers/models/opt/modeling_opt.

I haven't tried this, but I think you'll need to use the Hugging Face symbolic tracer to trace the model first, and then quantize with our API: transformers/src/transformers/utils/fx.py at main · huggingface/transformers · GitHub
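A minimal sketch of that suggestion, untested against your exact setup: trace the OPT model with `transformers.utils.fx.symbolic_trace` (which understands the model's keyword-argument `forward`), then run the FX quantization API on the traced `GraphModule`. The checkpoint name `facebook/opt-125m` and the input shapes here are illustrative assumptions, not from the original post.

```python
import torch
from torch.ao.quantization import QConfigMapping, default_dynamic_qconfig, quantize_fx
from transformers import OPTForCausalLM
from transformers.utils.fx import symbolic_trace  # Hugging Face's FX tracer

# assumed checkpoint for illustration; substitute your own OPT model
model_fp = OPTForCausalLM.from_pretrained("facebook/opt-125m")
model_fp.eval()

# HF's tracer handles the model's kwargs-based forward(); input_names pins
# down which inputs the traced graph should accept, avoiding the
# "decoder_input_ids and decoder_inputs_embeds" conflict seen above
traced = symbolic_trace(model_fp, input_names=["input_ids", "attention_mask"])

qconfig_mapping = QConfigMapping().set_global(default_dynamic_qconfig)

# example inputs must be a tuple matching input_names, in order
example_input_ids = torch.randint(0, 100, (1, 8))
example_attention_mask = torch.ones(1, 8, dtype=torch.long)
example_inputs = (example_input_ids, example_attention_mask)

model_prepared = quantize_fx.prepare_fx(traced, qconfig_mapping, example_inputs)
model_quantized = quantize_fx.convert_fx(model_prepared)
```

After `convert_fx`, the linear layers should run with dynamically quantized int8 weights; verify outputs against the fp32 model on a few inputs before relying on it.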