Could not run 'quantized::conv2d.new' with arguments from the 'QuantizedCUDA' backend

x = self.quant(x)
x = self.conv(x)
x = self.bn(x)
x = self.act(x)
x = self.dequant(x)

I trained a QAT model, and when I tried evaluating it, I got this error:

Could not run 'quantized::conv2d.new' with arguments from the 'QuantizedCUDA' backend … 'quantized::conv2d.new' is only available for these backends: [QuantizedCPU, …].

When I added x = x.to('cpu') before x = self.quant(x), to make it run on the QuantizedCPU backend (note that doing so, I am no longer able to train the model, as I will get:

RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

which is another problem), I then get:

Could not run 'aten::silu.out' with arguments from the 'QuantizedCPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit Internal Login for possible resolutions. 'aten::silu.out' is only available for these backends: [CPU, CUDA, …

So I changed the position of dequant to:

x = x.to('cpu')
x = self.quant(x)
x = self.conv(x)
x = self.bn(x)
x = self.dequant(x)
x = self.act(x)

I get the error pointing to x = self.quant(x):

Could not run 'aten::quantize_per_tensor' with arguments from the 'QuantizedCPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit Internal Login for possible resolutions. 'aten::quantize_per_tensor' is only available for these backends: [CPU, CUDA,

And if I remove x = self.quant(x), I get back:

Could not run 'quantized::conv2d.new' with arguments from the 'CPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit Internal Login for possible resolutions. 'quantized::conv2d.new' is only available for these backends: [QuantizedCPU,

Please help, as I've been encountering error after error even after searching online for solutions.

Hi, I'm also getting the same problem as you. If possible, could you give me a solution to fix this error?

Hi, I have not found a solution yet.
cc @jerryzh168

This error occurs when you try to run a quantized op whose weight or input is on CUDA. For example, if you were to take a correctly quantized model, do .to('cuda'), and then run the model, you'd get this error.

Based on the second error message, your weight is on CUDA. Note: changing where x.to('cpu') is located will not fix this problem if the actual op weight is on CUDA.

My guess is that somewhere in your code you have model.to('cuda') (likely during training) and you are not converting it back to CPU, i.e. model.to('cpu'), before trying to do quantization.
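
For reference, here is a minimal sketch of the eager-mode QAT flow being described (this is an assumed toy model and training loop, not your code; the key point is the model.to('cpu') call before convert):

import torch
import torch.nn as nn

class TinyQATModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, 3)
        self.act = nn.ReLU()   # ReLU instead of SiLU, see the note on silu below
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.conv(x)
        x = self.act(x)
        x = self.dequant(x)
        return x

model = TinyQATModel()
model.train()
model.qconfig = torch.quantization.get_default_qat_qconfig('fbgemm')
torch.quantization.prepare_qat(model, inplace=True)

# QAT training happens on CUDA; the fake-quant ops have CUDA kernels
model.to('cuda')
opt = torch.optim.SGD(model.parameters(), lr=0.01)
for _ in range(10):
    x = torch.randn(8, 3, 32, 32, device='cuda')
    loss = model(x).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Move everything back to CPU *before* convert, and run inference on CPU
model.to('cpu')
model.eval()
quantized = torch.quantization.convert(model)
out = quantized(torch.randn(1, 3, 32, 32))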

Additionally, it looks like your self.act op is aten::silu, which isn't being converted to a quantized op (it doesn't look like it has a quantized implementation: pytorch/activation.py at master · pytorch/pytorch · GitHub). You can either implement it yourself or change it to something along the lines of:

y = torch.sigmoid(x)
x = y * x
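
A minimal sketch of that idea as a drop-in module (the name QuantizableSiLU is mine, and this assumes the eager-mode workflow; FloatFunctional is used so the multiply is observed and gets a quantized kernel after convert):

import torch
import torch.nn as nn

class QuantizableSiLU(nn.Module):
    # silu(x) = x * sigmoid(x); sigmoid and FloatFunctional.mul
    # both have QuantizedCPU implementations
    def __init__(self):
        super().__init__()
        self.mul = nn.quantized.FloatFunctional()

    def forward(self, x):
        return self.mul.mul(x, torch.sigmoid(x))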

I would also maybe start with a less weird model and make sure the flow works for you before iterating on that. Something like: (beta) Static Quantization with Eager Mode in PyTorch — PyTorch Tutorials 1.9.1+cu102 documentation could be a good starting point.

My guess is that somewhere in your code you have model.to('cuda') (likely during training) and you are not converting it back to CPU, i.e. model.to('cpu'), before trying to do quantization.

Strange, because I have done model.to('cpu') before torch.quantization.convert(model).

You can inspect the model and identify whether the weight is stored correctly; it's possible it's not transferring over or something, though usually modules move their attributes over by default.
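
For example, something like this sketch (my own snippet, not an official API) can flag anything that is still on CUDA before you call convert:

# Print any parameter or buffer that is not on CPU
for name, t in list(model.named_parameters()) + list(model.named_buffers()):
    if t.device.type != 'cpu':
        print(name, t.device)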