Quantizing a model for Raspberry Pi

Hi,
I quantized my model using the fbgemm backend for my x64 CPU, and now I want to try it on a Raspberry Pi 4 where I compiled libtorch. I switched to the qnnpack backend, quantized the model on my x64 CPU and JIT-traced it, but when I try to load it I get an error:

RuntimeError: Didn’t find engine for operation quantized::conv2d_prepack NoQEngine

The libtorch build includes qnnpack. Do I have to do the quantization on the Raspberry Pi?
Thanks

           Alberto

Can you provide more information? How did you compile libtorch? When you perform the quantization, how do you set the qengine? Could you share a minimal reproducible script?

Hi,
I use FX for quantization:

import copy

import torch
from torch.quantization import quantize_fx
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder

# model, name and train_val_dataset are defined earlier in the full script
# backend = 'fbgemm'
backend = 'qnnpack'

model_q  = copy.deepcopy(model)
model_q.eval()
torch.backends.quantized.engine = backend
qconfig_dict = {"": torch.quantization.get_default_qconfig(backend)}
model_prepared = quantize_fx.prepare_fx(model_q, qconfig_dict)
model_prepared.eval()
print("Load dataset")
data = ImageFolder("")  # load dataset (path elided)
datasets = train_val_dataset(data)
print("Validation dataset size {}".format(len(datasets['val'])))
dataLoaderVal = DataLoader(datasets['val'], batch_size=64,
                           num_workers=4, pin_memory=True)

print("Calibration")
n = 0
with torch.inference_mode():
    for b in dataLoaderVal:
        n = n + 1
        model_prepared(b[0])
        if n > 100:
            break
print("End calibration")

model_quantized = quantize_fx.convert_fx(model_prepared)


example = torch.rand(1, 3, 244, 244)
traced_script_module = torch.jit.trace(model_quantized, example)
traced_script_module.save(name+"_q.pt")

Then I load the model in my C++ program with module = torch::jit::load(model_q).
With the fbgemm backend on my x64 desktop it works without problems, and it also works with qnnpack (it is slower, as expected).
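
For context, a minimal sketch of the loading side (simplified; the model path is assumed, and the explicit setQEngine call is an assumption on my part, in case the engine has to be selected by hand before loading a qnnpack model):

#include <torch/script.h>

int main() {
  // Assumption: select the quantized engine explicitly before loading.
  // setQEngine throws if the requested engine was not compiled into
  // this libtorch build.
  at::globalContext().setQEngine(at::QEngine::QNNPACK);

  torch::jit::script::Module module = torch::jit::load("model_q.pt");
  module.eval();

  // Same input shape as used for tracing.
  std::vector<torch::jit::IValue> inputs;
  inputs.push_back(torch::rand({1, 3, 244, 244}));
  at::Tensor output = module.forward(inputs).toTensor();
  return 0;
}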

On my Raspberry Pi I compiled libtorch from source with

cmake -DBUILD_PYTHON=OFF -DUSE_CUDA=OFF -DBUILD_SHARED_LIBS=ON -DCMAKE_BUILD_TYPE=Release

and in the lib directory there is libnnpack.a.
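
Note that libnnpack.a belongs to NNPACK, which is a separate library from QNNPACK, so by itself it does not confirm qnnpack support. As a quick check (a sketch, assuming the standard libtorch headers), the quantized engines actually compiled into a build can be listed at runtime:

#include <torch/torch.h>

#include <iostream>

int main() {
  // Print every quantized engine this libtorch build supports.
  // If the only line printed is NoQEngine, no quantized backend was
  // compiled in, which would match the error above.
  for (const auto engine : at::globalContext().supportedQEngines()) {
    std::cout << c10::toString(engine) << std::endl;
  }
  return 0;
}

If qnnpack is missing from that list, rebuilding with -DUSE_PYTORCH_QNNPACK=ON may be needed (an assumption on my part; I am not sure whether it is enabled by default for this configuration).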

                  Alberto