Not able to run quantized model on Android

dklvch · July 30, 2020, 2:47pm

I have a quantized model which works on an Intel CPU and can be traced, but it fails to run on Android. The float32 model works fine on mobile, though. Unfortunately, I cannot share the model. I get the following error:

java.lang.IllegalArgumentException: at::Tensor scalar type is not supported on java side

Environment

PyTorch version: 1.7.0.dev20200727
Is debug build: No
CUDA used to build PyTorch: 10.2

OS: Ubuntu 20.04 LTS
GCC version: (Ubuntu 8.4.0-3ubuntu2) 8.4.0
CMake version: version 3.16.3

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.2.89
GPU models and configuration: GPU 0: GeForce GTX 1070
Nvidia driver version: 440.100
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.19.0
[pip3] torch==1.7.0.dev20200727
[pip3] torchvision==0.8.0.dev20200727
[conda] Could not collect

cc @jerryzh168 @jianyuh @dzhulgakov @raghuramank100 @jamesr66a @vkuzo

dzhulgakov · July 30, 2020, 6:53pm

Most likely you’re trying to return a quantized tensor (we should improve the error message). We don’t have a Java binding for quantized tensors yet. You can try to dequantize the tensor within your model (something like result.dequantize()) or return the individual components of the tensor (result.int_repr(), result.q_scale(), result.q_zero_point()).
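
A minimal sketch of both suggestions, in case it helps. The original model was not shared, so StandInQuantizedModel below is a hypothetical stand-in that simply returns a quantized tensor, and DequantizedOutput is an illustrative wrapper name, not a PyTorch API. The scale and zero point are arbitrary example values.

```python
import torch
import torch.nn as nn


class StandInQuantizedModel(nn.Module):
    """Hypothetical stand-in for the real model: its output is a quantized tensor."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # scale=0.1 and zero_point=0 are arbitrary values chosen for illustration.
        return torch.quantize_per_tensor(x, 0.1, 0, torch.quint8)


class DequantizedOutput(nn.Module):
    """Illustrative wrapper: converts the quantized output back to float32."""

    def __init__(self, model: nn.Module):
        super().__init__()
        self.model = model

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumes the wrapped model returns a single quantized tensor.
        return self.model(x).dequantize()


x = torch.tensor([0.0, 0.5, 1.0])
quantized_model = StandInQuantizedModel()

# Option 1: dequantize inside the model so only a float tensor leaves it.
float_out = DequantizedOutput(quantized_model)(x)

# Option 2: expose the components of the quantized tensor instead.
q_out = quantized_model(x)
ints, scale, zero_point = q_out.int_repr(), q_out.q_scale(), q_out.q_zero_point()

# The float values can be rebuilt from the components on the caller's side.
rebuilt = (ints.to(torch.float32) - zero_point) * scale
```

With option 1, the wrapped module would presumably be traced and saved in place of the original one, so the Java side only ever receives a float32 tensor. With option 2, the caller has to apply (int_repr - zero_point) * scale itself to recover float values.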