Not able to run quantized model on Android

I have a quantized model that works on an Intel CPU and can be traced, but it fails to run on Android. The float32 version of the model works fine on mobile, though. Unfortunately, I cannot share the model. I get the following error:

java.lang.IllegalArgumentException: at::Tensor scalar type is not supported on java side

Environment

PyTorch version: 1.7.0.dev20200727
Is debug build: No
CUDA used to build PyTorch: 10.2

OS: Ubuntu 20.04 LTS
GCC version: (Ubuntu 8.4.0-3ubuntu2) 8.4.0
CMake version: version 3.16.3

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.2.89
GPU models and configuration: GPU 0: GeForce GTX 1070
Nvidia driver version: 440.100
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.19.0
[pip3] torch==1.7.0.dev20200727
[pip3] torchvision==0.8.0.dev20200727
[conda] Could not collect

cc @jerryzh168 @jianyuh @dzhulgakov @raghuramank100 @jamesr66a @vkuzo

Most likely you’re trying to return a quantized tensor from the model (we should improve the error message). We don’t have a Java binding for quantized tensors yet. You can either dequantize the tensor inside your model (something like result.dequantize()) or return its individual components (result.int_repr(), result.q_scale(), result.q_zero_point()).
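
Here is a rough sketch of both workarounds. Since the real model can't be shared, ToyQuantizedModel is a hypothetical stand-in that simply returns a quantized tensor, and DequantizeOutput is an illustrative wrapper name, not something from PyTorch. The scale and zero point are arbitrary values picked for the example.

```python
import torch


class ToyQuantizedModel(torch.nn.Module):
    """Hypothetical stand-in for the real model: its output is a quantized tensor."""

    def forward(self, x):
        # Arbitrary quantization parameters chosen for illustration only.
        return torch.quantize_per_tensor(x, 0.1, 0, torch.quint8)


class DequantizeOutput(torch.nn.Module):
    """Illustrative wrapper that converts the quantized output back to float32."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        result = self.model(x)
        # The Java side then only ever sees a regular float32 tensor.
        return result.dequantize()


x = torch.tensor([[0.0, 0.5, 1.0, 1.5]])

# Workaround 1: dequantize inside the model, then trace the wrapper as usual.
wrapped = DequantizeOutput(ToyQuantizedModel())
traced = torch.jit.trace(wrapped, x)
out = traced(x)

# Workaround 2 (shown in eager mode): split the quantized tensor into its parts.
quantized = ToyQuantizedModel()(x)
int_values = quantized.int_repr()        # raw uint8 values
scale = quantized.q_scale()              # Python float
zero_point = quantized.q_zero_point()    # Python int

# The float values can be rebuilt from the parts on the consumer side.
rebuilt = (int_values.to(torch.float32) - zero_point) * scale
```

With the first approach the exported model's output is an ordinary float tensor, so nothing else should need to change on the Android side. With the second approach you would have to pass the scale and zero point along yourself and redo the arithmetic in the last line in your app.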