Quantized model does not provide performance improvements

fel88 · July 16, 2020, 9:00pm

I tried to run the quantization benchmark:

But I didn’t see any speedup with the quantized model.
For example:
googlenet:
Train time:
q: 192.940
f: 192.940
Test time:
q: 193.114
f: 193.114

On the third model I got:
Downloading: "https://download.pytorch.org/models/mobilenet_v2-b0353104.pth" to …/.cache\torch\hub\checkpoints\mobilenet_v2-b0353104.pth
100%|█████████████████████████████████████| 13.6M/13.6M [00:01<00:00, 8.20MB/s]

File "…Anaconda3\envs\torch1.5\lib\site-packages\torchvision\models\quantization\utils.py", line 22, in quantize_model
    raise RuntimeError("Quantized backend not supported ")
RuntimeError: Quantized backend not supported

Why does the quantized model not provide performance improvements?
How can I activate the quantized backend?

My environment:
conda 4.8.3
torch 1.7.0.dev20200716
torchvision 0.8.0.dev20200716
Python 3.6.10

CPU: i5-4670

fel88 · July 20, 2020, 2:51pm

print(torch.backends.quantized.supported_engines)
['none', 'fbgemm']

Vasiliy_Kuznetsov · July 20, 2020, 3:46pm

One thing to try would be:

torch.backends.quantized.engine = 'fbgemm'
model.qconfig = torch.quantization.get_default_qat_qconfig('fbgemm')
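
A minimal sketch of the same idea, assuming the goal is to pick whichever engine the local build reports rather than hard-coding one. The helper names `pick_quantized_engine` and `activate_quantized_engine` are hypothetical and not part of PyTorch; only `torch.backends.quantized.supported_engines` and `torch.backends.quantized.engine` come from the thread. The preference order (`fbgemm`, then `qnnpack`) is also an assumption.

```python
def pick_quantized_engine(supported, preferred=("fbgemm", "qnnpack")):
    """Return the first preferred engine found in `supported`, else None.

    `supported` is a list such as torch.backends.quantized.supported_engines.
    The placeholder entry 'none' is never selected because it is not in
    `preferred`.
    """
    for name in preferred:
        if name in supported:
            return name
    return None


def activate_quantized_engine():
    """Hypothetical helper: select a supported engine and make it active."""
    import torch  # imported here so the helper above works without PyTorch

    engine = pick_quantized_engine(torch.backends.quantized.supported_engines)
    if engine is None:
        raise RuntimeError("No quantized engine available in this PyTorch build")
    torch.backends.quantized.engine = engine
    return engine
```

With the list printed earlier in the thread, `pick_quantized_engine(['none', 'fbgemm'])` returns `'fbgemm'`. Calling `activate_quantized_engine()` before building the quantized model may help, though it might not resolve the error if the model-loading code checks the backend in a different way.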

fel88 · July 21, 2020, 2:13pm

It still doesn’t work, but the static quantization tutorial at
https://pytorch.org/tutorials/advanced/static_quantization_tutorial.html
works well. That is enough for my purposes.
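
For anyone re-running the comparison: q and f timings that match to three decimals, as in the first post, may indicate that the same measurement was reported twice or that quantization never actually ran. Below is a rough sketch of timing the two variants separately. The names `time_call` and `compare` are hypothetical, and the callables stand in for whatever runs the float and quantized models.

```python
import time


def time_call(fn, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare(float_fn, quant_fn, repeats=5):
    """Time both callables independently and report the float/quantized ratio."""
    f = time_call(float_fn, repeats)
    q = time_call(quant_fn, repeats)
    # Guard against a zero reading from a very coarse timer.
    speedup = f / q if q > 0 else float("inf")
    return {"f": f, "q": q, "speedup": speedup}
```

In a PyTorch setting this could be called as `compare(lambda: float_model(x), lambda: quantized_model(x))`, where `float_model`, `quantized_model`, and `x` are placeholders for your own objects. Taking the minimum over several runs is one way to reduce noise from warm-up and background load; it is a design choice, not the method the original benchmark used.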