Hello! I am trying to quantize a pretrained ResNet-50 model, but I am running into the following error.

```
NotImplementedError: Could not run 'aten::quantize_per_tensor' with arguments from the 'QuantizedCPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::quantize_per_tensor' is only available for these backends: [CPU, CUDA, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMeta, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PythonDispatcher].
```

I can’t figure out what is going wrong. If someone could help me out, that would be amazing. Here is my code:

```
import torch
import numpy as np
import time


class QuantizedModel(torch.nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model_fp32 = model
        self.quant = torch.quantization.QuantStub()
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.model_fp32(x)
        x = self.dequant(x)
        return x


model = torch.hub.load('pytorch/vision', 'resnet50', pretrained=True).to('cuda')
quant_model = QuantizedModel(model)
quant_model.eval()

quant_model.qconfig = torch.ao.quantization.default_qconfig
print(quant_model.qconfig)

quant_model = torch.ao.quantization.prepare(quant_model, inplace=False)
quant_model = torch.ao.quantization.convert(quant_model, inplace=False)

input_tensor = preprocess(img).unsqueeze(0)
input_tensor = torch.quantize_per_tensor(input_tensor, scale=1.0, zero_point=0, dtype=torch.quint8)
output_batch_tensor = quant_model(input_tensor)
```
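
While narrowing this down, I did notice that calling `torch.quantize_per_tensor` on a tensor that is already quantized raises the same error in isolation, though I'm not sure whether that's related:

```
import torch

x = torch.randn(1, 3, 224, 224)
q = torch.quantize_per_tensor(x, scale=1.0, zero_point=0, dtype=torch.quint8)
# Quantizing the already-quantized tensor raises the same NotImplementedError
# from the 'QuantizedCPU' backend
q2 = torch.quantize_per_tensor(q, scale=1.0, zero_point=0, dtype=torch.quint8)
```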

I saw a very similar thread, but I wasn’t able to find the answer to my problem there.
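
For reference, this is the eager-mode post-training static quantization recipe I was trying to adapt, as I understood it from the PyTorch quantization docs. It reuses the `QuantizedModel` wrapper from my code above, and `calibration_loader` is just a placeholder for a loader of representative inputs; maybe I'm deviating from this flow somewhere:

```
import torch

# Load the float model and keep it on CPU; static quantization runs in eval mode
model_fp32 = torch.hub.load('pytorch/vision', 'resnet50', pretrained=True)
model_fp32.eval()

wrapped = QuantizedModel(model_fp32)  # QuantStub/DeQuantStub wrapper from above
wrapped.qconfig = torch.ao.quantization.get_default_qconfig('fbgemm')  # x86 CPU backend

prepared = torch.ao.quantization.prepare(wrapped, inplace=False)

# Calibration: run float batches through the prepared model so the observers
# can record activation ranges before conversion
with torch.no_grad():
    for images, _ in calibration_loader:  # placeholder loader
        prepared(images)

quantized = torch.ao.quantization.convert(prepared, inplace=False)

# Inference takes a regular float tensor; the QuantStub quantizes it internally
output = quantized(torch.randn(1, 3, 224, 224))
```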