Quantizing a model for Raspberry Pi

Hi,
I quantized my model using the fbgemm backend for my x64 CPU, and now I want to try it on a Raspberry Pi 4 where I compiled libtorch. I switched to the qnnpack backend, quantized the model on my x64 CPU and JIT-traced it, but when I try to load it I get an error:

RuntimeError: Didn’t find engine for operation quantized::conv2d_prepack NoQEngine

The libtorch build includes qnnpack. Do I have to do the quantization on the Raspberry Pi?
Thanks

           Alberto

Can you provide more information? How did you compile libtorch? When you perform the quantization, how do you set the qengine? Could you share a minimal reproducible script?

Hi,
I use FX for quantization:

import copy

import torch
from torch.quantization import quantize_fx
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder

# model, name and train_val_dataset are defined earlier in the full script
# backend = 'fbgemm'
backend = 'qnnpack'

model_q  = copy.deepcopy(model)
model_q.eval()
torch.backends.quantized.engine = backend
qconfig_dict = {"": torch.quantization.get_default_qconfig(backend)}
model_prepared = quantize_fx.prepare_fx(model_q, qconfig_dict)
model_prepared.eval()
print("Load dataset")
data = ImageFolder("")  # load dataset (path elided)
datasets = train_val_dataset(data)
print("Validation dataset size {}".format(len(datasets['val'])))
dataLoaderVal = DataLoader(datasets['val'], batch_size=64,
                           num_workers=4, pin_memory=True)

print("Calibration")
n = 0
with torch.inference_mode():
    for b in dataLoaderVal:
        n = n + 1
        model_prepared(b[0])
        if n > 100:
            break
print("End calibration")

model_quantized = quantize_fx.convert_fx(model_prepared)


example = torch.rand(1, 3, 244, 244)
traced_script_module = torch.jit.trace(model_quantized, example)
traced_script_module.save(name+"_q.pt")

Then I load the model in my C++ program with module = torch::jit::load(model_q).
With the fbgemm backend on my x64 desktop it works without problems, and it also works with qnnpack (it is slower, as expected).
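
For context, a minimal sketch of the loading side (simplified; the model path is assumed, and the explicit setQEngine call is an assumption on my part, in case the engine has to be selected by hand before loading a qnnpack model):

#include <torch/script.h>

int main() {
  // Assumption: select the quantized engine explicitly before loading.
  // setQEngine throws if the requested engine was not compiled into
  // this libtorch build.
  at::globalContext().setQEngine(at::QEngine::QNNPACK);

  torch::jit::script::Module module = torch::jit::load("model_q.pt");
  module.eval();

  // Same input shape as used for tracing.
  std::vector<torch::jit::IValue> inputs;
  inputs.push_back(torch::rand({1, 3, 244, 244}));
  at::Tensor output = module.forward(inputs).toTensor();
  return 0;
}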

On my Raspberry Pi I compiled libtorch from source with

cmake -DBUILD_PYTHON=OFF -DUSE_CUDA=OFF -DBUILD_SHARED_LIBS=ON -DCMAKE_BUILD_TYPE=Release

and in the lib directory there is libnnpack.a.
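
Note that libnnpack.a belongs to NNPACK, which is a separate library from QNNPACK, so by itself it does not confirm qnnpack support. As a quick check (a sketch, assuming the standard libtorch headers), the quantized engines actually compiled into a build can be listed at runtime:

#include <torch/torch.h>

#include <iostream>

int main() {
  // Print every quantized engine this libtorch build supports.
  // If the only line printed is NoQEngine, no quantized backend was
  // compiled in, which would match the error above.
  for (const auto engine : at::globalContext().supportedQEngines()) {
    std::cout << c10::toString(engine) << std::endl;
  }
  return 0;
}

If qnnpack is missing from that list, rebuilding with -DUSE_PYTORCH_QNNPACK=ON may be needed (an assumption on my part; I am not sure whether it is enabled by default for this configuration).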

                  Alberto