Loading a dynamically quantized Transformers model


I’ve trained a custom transformer model and followed this to save a quantized model.

However when I try to load the model using


I receive the following error:

Missing key(s) in state_dict: "xxxxx.weight", ...
Unexpected key(s) in state_dict: "xxxx.scale", "xxxx.zero_point", ...

It looks like the names of the model's original parameters have been changed. Can anyone help me resolve this error?

When you load the state_dict, is the model you are loading it into already a quantized model? If you convert the model to a quantized model first and then load the quantized state_dict, it should work.
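To see why the keys mismatch, you can compare the state_dict keys of a float model with those of its dynamically quantized counterpart. This is a minimal sketch; the class A here is a hypothetical stand-in for your architecture, not your actual code:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the original architecture.
class A(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

model = A()
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# The float model stores plain parameters, e.g. 'fc.weight', 'fc.bias'...
print(list(model.state_dict().keys()))
# ...while the quantized model stores packed params plus scale/zero_point,
# which is why a quantized checkpoint won't load into a float model.
print(list(quantized_model.state_dict().keys()))
```

So the saved checkpoint's keys only line up with a model that has been passed through quantize_dynamic first.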

I'm not sure I understand. Assume class A inherits from nn.Module and corresponds to the architecture of my DNN.

model = A()
is essentially all I do. Do I need to do anything to quantize it?

Ok, I think I get it now. I have to do something like this after instantiating the model:

model = A()

quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
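Exactly. With the model converted first, the quantized state_dict should then load without key errors. Here is a self-contained sketch of the full round trip, using an in-memory buffer in place of a checkpoint file and a hypothetical class A standing in for the real architecture:

```python
import io
import torch
import torch.nn as nn

# Hypothetical stand-in for the original architecture.
class A(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

# Quantize, then save the quantized state_dict (as in the original setup).
quantized_model = torch.quantization.quantize_dynamic(
    A(), {torch.nn.Linear}, dtype=torch.qint8
)
buffer = io.BytesIO()  # stands in for a file path like "model.pt"
torch.save(quantized_model.state_dict(), buffer)
buffer.seek(0)

# To load: rebuild the float model, convert it the same way,
# and only then load the quantized state_dict.
fresh = torch.quantization.quantize_dynamic(
    A(), {torch.nn.Linear}, dtype=torch.qint8
)
# weights_only=False because the quantized state_dict contains
# packed params, not just plain tensors.
result = fresh.load_state_dict(torch.load(buffer, weights_only=False))
```

Loading into the un-converted float model is what produced the missing/unexpected key errors above.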