Inference error after int8 quantization with PyTorch

FengMu1995 · December 13, 2021, 2:57am

When I ran inference on my model after int8 quantization, I got the following error. What should I do to solve it?

NotImplementedError: Could not run ‘quantized::conv2d.new’ with arguments from the ‘CPU’ backend. This could be because the operator doesn’t exist for this backend, or was omitted during the selective/custom build process (if using custom build).

anantguptadbl · December 13, 2021, 5:43am

Can you provide the code for the model you are trying to quantize? FYI, quantization is not yet implemented for CUDA.

At the moment PyTorch doesn’t provide quantized operator implementations on CUDA - this is the direction for future work. Move the model to CPU in order to test the quantized functionality.
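For reference, here is a minimal sketch of eager-mode post-training static quantization running entirely on CPU. It is not the poster's model: `TinyConvNet`, the layer sizes, and the random calibration data are hypothetical, and the engine choice (fbgemm, falling back to qnnpack) is an assumption about the local build. One common way to hit the `quantized::conv2d.new` message on CPU is to feed a regular float tensor into a converted conv layer, which happens when the input never passes through a `QuantStub`; the last lines try to reproduce that case.

```python
import torch
import torch.nn as nn

# Pick a quantized engine this build supports (assumption: fbgemm on x86, else qnnpack).
engine = "fbgemm" if "fbgemm" in torch.backends.quantized.supported_engines else "qnnpack"
torch.backends.quantized.engine = engine


class TinyConvNet(nn.Module):  # hypothetical model, for illustration only
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # float -> quantized
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # quantized -> float

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)


model = TinyConvNet().eval()
model.qconfig = torch.quantization.get_default_qconfig(engine)

# Insert observers, run a calibration pass, then swap in quantized modules.
prepared = torch.quantization.prepare(model)
with torch.no_grad():
    prepared(torch.rand(4, 3, 16, 16))  # random data stands in for real calibration data
quantized = torch.quantization.convert(prepared)

x = torch.rand(1, 3, 16, 16)
with torch.no_grad():
    out = quantized(x)  # the float input is quantized by self.quant first
print(out.shape, out.dtype)

# Calling the converted conv directly on a float tensor skips QuantStub.
try:
    quantized.conv(x)
except RuntimeError as err:  # NotImplementedError is a subclass of RuntimeError
    print(type(err).__name__, str(err).splitlines()[0])
```

If the model already works this way and the error persists, the next thing to check is whether every tensor reaching a quantized layer has actually been quantized (for example, a second input or a skip connection that bypasses the stub).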

FengMu1995 · December 13, 2021, 7:57am

I run inference on the model in CPU mode. I switched to graph mode to quantize the model, but it generated errors:

Isn’t the if branch supported?
def forward(self, x):
    if x.ndim == 5:
        return self.forward_time_series(x)
    else:
        return self.forward_single_frame(x)

anantguptadbl · December 13, 2021, 8:43am

@FengMu1995 the if branch is definitely supported

Here is an example. Without looking at the entire code, it is difficult to understand the issue. I suspect that some elements are being switched between GPU and CPU, and since quantization does not work on the GPU, it throws an error.

import torch
import torch.nn as nn

class random_model(nn.Module):
    def __init__(self):
        super(random_model, self).__init__()
        self.model1 = nn.Sequential(
            nn.Linear(100, 10), 
            nn.BatchNorm1d(10),
            nn.ReLU(),
            nn.Linear(10, 4), 
            nn.BatchNorm1d(4),
            nn.ReLU(),
            nn.Linear(4, 1),
        )
        self.model2 = nn.Sequential(
            nn.Linear(100, 10), 
            nn.BatchNorm1d(10),
            nn.ReLU(),
            nn.Linear(10, 1),
        )
    def forward(self, x, flag_condition=True):
        if flag_condition==True:
            return self.model1(x)
        else:
            return self.model2(x)
        
X = torch.rand(100, 100)
y = torch.randint(2, (100, 1)).float()  # shape (100, 1) matches the model output, so MSELoss does not broadcast
model = random_model()
criterion = nn.MSELoss()
num_epochs = 100
learning_rate = 1e-3  # same value the optimizer used before; now passed in explicitly
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

for cur_epoch in range(num_epochs):
    model.zero_grad()
    if cur_epoch % 2 ==0:
        output = model(X, flag_condition=True)
    else:
        output = model(X, flag_condition=False)
    loss = criterion(output, y)  # conventional (input, target) argument order
    loss.backward()
    optimizer.step()
    print("Cur Epoch {0} loss is {1}".format(cur_epoch, loss.item()))

jerryzh168 · December 13, 2021, 6:58pm

please take a look at Quantization — PyTorch 2.1 documentation

FengMu1995 · December 14, 2021, 2:44am

The “if branch” is not used to switch between GPU and CPU. It only checks the number of input dimensions.
The source code is as follows:
if x.ndim == 5:
    return self.forward_time_series(x)
else:
    return self.forward_single_frame(x)

The errors are as follows:

File "/home/wudi/Software/yes/envs/torch1.9/lib/python3.8/site-packages/torch/quantization/quantize_jit.py", line 54, in _prepare_jit
    model_c = torch._C._jit_pass_insert_observers(model._c,
RuntimeError: branches for if should return values that are observed consistently, if node:%5 : Tensor[] = prim::If(%4) # /data00/peterlin/RVM/model/mobilenetv3.py:69:8
  block0():
    %6 : Tensor[] = prim::CallMethod[name="forward_time_series"](%self.1, %x.1) # /data00/peterlin/RVM/model/mobilenetv3.py:70:19
    -> (%6)
  block1():
    %7 : Tensor[] = prim::CallMethod[name="forward_single_frame"](%self.1, %x.1) # /data00/peterlin/RVM/model/mobilenetv3.py:72:19
    -> (%7)

FengMu1995 · December 14, 2021, 2:44am

the problem has been solved

anantguptadbl · December 14, 2021, 3:56am

@FengMu1995 aah okay. Let me try that as well. Sounds interesting

anantguptadbl · December 14, 2021, 3:56am

Please share the solution

Eva1 · December 15, 2021, 6:08am

I have met the same problem as you. I guess that’s because your original model is built with conv2d, but your quantized model is built with quantized::conv2d, so when you try to restore a quantized model from disk, it cannot run quantized::conv2d.new on conv2d.
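To illustrate the restore case, here is a hedged sketch of reloading quantized weights. The usual pattern is to rebuild the quantized module structure (prepare, calibrate, convert) before calling `load_state_dict`, rather than loading into the original float model. `TinyNet` and `build_quantized` are hypothetical names, the engine choice is an assumption, and the state dict is passed in memory here; a `torch.save` / `torch.load` round trip of the same state dict would sit between the two steps.

```python
import torch
import torch.nn as nn

# Assumption: fbgemm on x86, else qnnpack.
engine = "fbgemm" if "fbgemm" in torch.backends.quantized.supported_engines else "qnnpack"
torch.backends.quantized.engine = engine


class TinyNet(nn.Module):  # hypothetical model, for illustration only
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.conv(self.quant(x)))


def build_quantized(calibration_batch):
    """Hypothetical helper: float model -> prepare -> calibrate -> convert."""
    model = TinyNet().eval()
    model.qconfig = torch.quantization.get_default_qconfig(engine)
    prepared = torch.quantization.prepare(model)
    with torch.no_grad():
        prepared(calibration_batch)
    return torch.quantization.convert(prepared)


# The "original" quantized model and its state dict (what would be saved to disk).
original = build_quantized(torch.rand(8, 3, 16, 16))
state = original.state_dict()

# Restore: rebuild the quantized structure first, then load the quantized state.
restored = build_quantized(torch.rand(1, 3, 16, 16))
restored.load_state_dict(state)

x = torch.rand(2, 3, 16, 16)
with torch.no_grad():
    same = torch.allclose(restored(x), original(x))
print("outputs match:", same)

# Loading the quantized state into a plain float model is expected to be rejected,
# because the float modules do not have keys such as conv.scale / conv.zero_point.
try:
    TinyNet().load_state_dict(state)
except RuntimeError as err:
    print("float model rejected the quantized state:", str(err).splitlines()[0])
```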

FengMu1995 · December 20, 2021, 2:09am

It is different. This “if branch” problem has not been solved.

jerryzh168 · January 7, 2022, 6:17pm

Can you try using eager mode to quantize the if branch? Also, can you describe the problem in a bit more detail, in terms of what you are trying to achieve and what the output is?
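A rough sketch of what the eager-mode suggestion could look like for a `forward` that branches on `x.ndim`. Eager mode does not script or trace the model, so the Python `if` simply runs as written. `BranchingNet` is a hypothetical stand-in, not the model from the traceback, and the way the time-series path folds the time axis into the batch axis is an assumption made only to keep the example small.

```python
import torch
import torch.nn as nn

# Assumption: fbgemm on x86, else qnnpack.
engine = "fbgemm" if "fbgemm" in torch.backends.quantized.supported_engines else "qnnpack"
torch.backends.quantized.engine = engine


class BranchingNet(nn.Module):  # hypothetical stand-in for the real model
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)
        self.dequant = torch.quantization.DeQuantStub()

    def forward_single_frame(self, x):  # x: (B, C, H, W)
        return self.dequant(self.conv(self.quant(x)))

    def forward_time_series(self, x):   # x: (B, T, C, H, W)
        b, t = x.shape[:2]
        # Fold time into the batch axis, reuse the single-frame path, then unfold.
        y = self.forward_single_frame(x.flatten(0, 1))
        return y.reshape(b, t, *y.shape[1:])

    def forward(self, x):
        if x.ndim == 5:
            return self.forward_time_series(x)
        else:
            return self.forward_single_frame(x)


model = BranchingNet().eval()
model.qconfig = torch.quantization.get_default_qconfig(engine)
prepared = torch.quantization.prepare(model)

# Calibrate through both branches so the observers see both input layouts.
with torch.no_grad():
    prepared(torch.rand(2, 3, 8, 8))
    prepared(torch.rand(2, 3, 3, 8, 8))
quantized = torch.quantization.convert(prepared)

with torch.no_grad():
    out_frame = quantized(torch.rand(2, 3, 8, 8))
    out_series = quantized(torch.rand(2, 3, 3, 8, 8))
print(out_frame.shape, out_series.shape)
```

The trade-off is that eager mode needs the `QuantStub` / `DeQuantStub` placement (and any module fusion) done by hand, whereas graph mode inserts observers automatically but requires both branches of the `if` to be observed consistently.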

lslsldjehg · June 3, 2024, 12:36pm

@FengMu1995 I would like to ask how you solved this problem. I have the same problem, so please help me solve it. NotImplementedError: Could not run ‘quantized::conv2d.new’ with arguments from the ‘CPU’ backend. This could be because the operator doesn’t exist for this backend, or was omitted during the selective/custom build process (if using custom build).