Models with different parameter counts have similar CPU usage

Sining_Sun · November 13, 2020, 10:00am

Hi all,

Recently, I deployed my models to Android using libtorch (1.6). The models are quantized and exported with torch.jit.script. When I run inference in C++, I find that a model with 0.8M parameters has the same CPU usage as a model with 2M parameters. I have set the thread count to 1 using the following code:

at::set_num_threads(1);
at::set_num_interop_threads(1);
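
With a single thread, a CPU-usage percentage may look the same for both models while inference runs in a loop, so time per forward pass might be a more telling comparison. Below is a minimal sketch using only the C++ standard library. `Timing`, `time_per_call`, and `run_once` are hypothetical names introduced for illustration and are not part of libtorch; `run_once` stands in for one forward pass of the model.

```cpp
#include <chrono>
#include <cstddef>
#include <ctime>

// Average cost of one call, in milliseconds.
struct Timing {
    double wall_ms;  // elapsed wall-clock time per call
    double cpu_ms;   // processor time per call, as reported by std::clock
};

// Calls `run_once` `iterations` times and returns the average per-call cost.
// `run_once` is a placeholder (assumption) for one forward pass, for example
// a lambda that wraps the model's forward call.
template <typename F>
Timing time_per_call(F&& run_once, std::size_t iterations) {
    if (iterations == 0) {
        return Timing{0.0, 0.0};
    }

    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();

    for (std::size_t i = 0; i < iterations; ++i) {
        run_once();
    }

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    const double wall_total_ms =
        std::chrono::duration<double, std::milli>(wall_end - wall_start).count();
    // On POSIX systems std::clock counts processor time used by the process.
    const double cpu_total_ms =
        1000.0 * static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;

    const double n = static_cast<double>(iterations);
    return Timing{wall_total_ms / n, cpu_total_ms / n};
}
```

Calling `time_per_call` with the same iteration count for the 0.8M and the 2M model, after a few warm-up passes, would give per-call numbers that can be compared directly.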

How can I tell whether I’m using QNNPACK when I run inference?