Could not run ‘aten::q_scale’ with arguments from the ‘CUDA’ backend. This could be because the operator doesn’t exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit Internal Login for possible resolutions. ‘aten::q_scale’ is only available for these backends: [QuantizedCPU, QuantizedCUDA, BackendSelect, Named, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, UNKNOWN_TENSOR_TYPE_ID, AutogradMLC, AutogradHPU, AutogradNestedTensor, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].
I get the same error when I try it on the CPU backend.
I even tried detaching the tensor and reassigning it to CUDA after the operation, and it still fails.
What I am doing seems to be a call to a C++ function that doesn't appear to be bound to Python.
Is there an alternative way to extract the scale from the tensor, or a fix for this?
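For reference, here is roughly the pattern that triggers it (a toy repro, not my actual model):

```python
import torch

# q_scale() is only implemented for quantized tensors (QuantizedCPU / QuantizedCUDA).
x = torch.randn(4)
qx = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)
print(qx.q_scale())  # works, prints 0.1

# On a plain float tensor (CPU or CUDA) the same call raises the dispatch error above:
# x.q_scale()          # "Could not run 'aten::q_scale' with arguments from the 'CPU' backend"
# x.cuda().q_scale()   # same error for the CUDA backend
```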
As input to the whole model, I am passing a regular (float) tensor. The QuantStub should be able to quantize the inputs and pass them along, right?
I am adding a QuantWrapper to my model and then passing it to the prepare stage.
Somehow I am seeing that the input to the function is still a regular (non-quantized) tensor.
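Roughly what I am doing, with a toy model standing in for mine:

```python
import torch
import torch.nn as nn
from torch.quantization import QuantWrapper, get_default_qconfig, prepare, convert

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())

wrapped = QuantWrapper(model)                    # wraps the model with QuantStub / DeQuantStub
wrapped.qconfig = get_default_qconfig('fbgemm')
wrapped.eval()
prepared = prepare(wrapped)                      # inserts observers; nothing is quantized yet

# Calibration: the inputs are regular float tensors, and that is what the observers see.
prepared(torch.randn(1, 3, 32, 32))

quantized = convert(prepared)                    # QuantStub only becomes a real quantize op here
```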
The QuantStub gets replaced by a quantize op during convert. Observers take in non-quantized tensors and analyze them, so by calling x.q_scale() on the input in your observer, you are applying q_scale() to a normal float tensor, which is what causes the error.
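To illustrate, a rough sketch using the built-in MinMaxObserver (a custom observer works the same way): during calibration the observer only records statistics of the float tensor, and the scale comes from calculate_qparams(), not from q_scale() on the input.

```python
import torch
from torch.quantization import MinMaxObserver

obs = MinMaxObserver(dtype=torch.quint8)

# During prepare/calibration the observer's forward sees plain float tensors.
obs(torch.randn(16))

# The quantization parameters are computed from the observed statistics.
scale, zero_point = obs.calculate_qparams()
print(scale, zero_point)

# q_scale() is only valid on a tensor that has actually been quantized,
# i.e. after convert() (or an explicit quantize_per_tensor call):
qx = torch.quantize_per_tensor(torch.randn(16), float(scale), int(zero_point), torch.quint8)
print(qx.q_scale())
```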