I want to quantize a model that contains Conv3d layers and run the model on Android devices. When I follow the tutorial (Quantization — PyTorch 1.11.0 documentation) to perform static quantization, the Android device throws the following exception when loading the quantized model:
com.facebook.jni.CppException: prepack/__setstate__: QNNPACK only supports Conv2d now. ()
Thus, I want to exclude the Conv3d layers from quantization (i.e. leave the Conv3d layer in floating point). My quantization code is as follows:
import torch, torch.nn as nn
from torch.quantization import QuantStub, DeQuantStub, get_qconfig_propagation_list
from torch.utils.mobile_optimizer import optimize_for_mobile

input_fp32 = torch.rand(size=(1, 3, 16, 224, 224))

# just an example model
model_fp32 = nn.Sequential(
    QuantStub(),
    nn.Conv3d(in_channels=3, out_channels=3, kernel_size=5, stride=1),
    nn.Linear(220, 220),
    DeQuantStub(),
)
model_fp32.eval()
model_fp32.qconfig = torch.quantization.get_default_qconfig('qnnpack')
model_fp32 = torch.quantization.prepare(model_fp32)

qconfig_propagation_list = list(filter(lambda x: x != torch.nn.modules.conv.Conv3d, get_qconfig_propagation_list()))
model_fp32_prepared = torch.quantization.prepare(model_fp32, allow_list=qconfig_propagation_list)
model_fp32_prepared(input_fp32)

model_int8 = torch.quantization.convert(model_fp32_prepared)
traced_model = torch.jit.trace(model_int8, input_fp32)
traced_script_module_optimized = optimize_for_mobile(traced_model)
traced_script_module_optimized._save_for_lite_interpreter('model_quantized_int8.pkl')
The same exception, QNNPACK only supports Conv2d now. (), persists when loading the quantized model. Did I make a mistake in the above code? Or does it mean that when running quantized models on Android devices, the model must not contain any layer that is not supported by QNNPACK, even if these layers are not quantized?
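To narrow down which of the two cases applies, one option is to check whether the Conv3d layer was actually left unquantized after convert. Below is a minimal sketch of such a check. `leaf_module_types` is a hypothetical helper written for this post, not a PyTorch API, and the small float model is only a stand-in so that the snippet runs on its own; in practice the converted `model_int8` would be passed instead. My assumption is that a converted layer shows up with a class path containing `quantized`, while an excluded layer keeps its `torch.nn.modules...` path.

```python
import torch.nn as nn


def leaf_module_types(model: nn.Module) -> dict:
    """Map each leaf submodule name to its fully qualified class name.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    return {
        name: f"{type(mod).__module__}.{type(mod).__name__}"
        for name, mod in model.named_modules()
        # skip the root module (empty name) and any container with children
        if name and not list(mod.children())
    }


# Float stand-in so the snippet is self-contained (assumption: replace
# `demo` with the converted model_int8 when checking the real model).
demo = nn.Sequential(
    nn.Conv3d(in_channels=3, out_channels=3, kernel_size=5, stride=1),
    nn.Linear(220, 220),
)

types = leaf_module_types(demo)
for name, cls in types.items():
    print(name, cls)
```

If the entry for the Conv3d layer of `model_int8` contains `quantized`, the layer was converted despite the filtered allow list, which would point to the quantization code rather than to a restriction of the mobile runtime.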