When I ran inference on my model with int8 quantization, I got the following error. What should I do to solve it?
NotImplementedError: Could not run 'quantized::conv2d.new' with arguments from the 'CPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build).
Can you provide the model code that you are trying to quantize? FYI, quantization is not implemented yet for CUDA.
At the moment PyTorch doesn't provide quantized operator implementations on CUDA; that is a direction for future work. Move the model to CPU in order to test the quantized functionality.
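For example, something like this (a minimal sketch; quantized_model and x are placeholder names, not from your code):

quantized_model = quantized_model.cpu()  # quantized kernels exist only on CPU
x = x.cpu()                              # inputs must be CPU tensors as well
with torch.no_grad():
    out = quantized_model(x)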
Here is an example. Without looking at the entire code it would be difficult to understand the issue. I suspect some parts are being switched between GPU and CPU, and since quantization does not work on the GPU, it throws an error:
import torch
import torch.nn as nn

class random_model(nn.Module):
    def __init__(self):
        super().__init__()
        self.model1 = nn.Sequential(
            nn.Linear(100, 10),
            nn.BatchNorm1d(10),
            nn.ReLU(),
            nn.Linear(10, 4),
            nn.BatchNorm1d(4),
            nn.ReLU(),
            nn.Linear(4, 1),
        )
        self.model2 = nn.Sequential(
            nn.Linear(100, 10),
            nn.BatchNorm1d(10),
            nn.ReLU(),
            nn.Linear(10, 1),
        )

    def forward(self, x, flag_condition=True):
        if flag_condition:
            return self.model1(x)
        else:
            return self.model2(x)
X = torch.rand(100, 100)
y = torch.randint(2, (100,)).type(torch.FloatTensor)
model = random_model()
criterion = nn.MSELoss()
num_epochs = 100
learning_rate = 1e-2
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

for cur_epoch in range(num_epochs):
    model.zero_grad()
    if cur_epoch % 2 == 0:
        output = model(X, flag_condition=True)
    else:
        output = model(X, flag_condition=False)
    # MSELoss expects (input, target); squeeze so shapes match (100,) vs (100, 1)
    loss = criterion(output.squeeze(1), y)
    loss.backward()
    optimizer.step()
    print("Cur Epoch {0} loss is {1}".format(cur_epoch, loss.item()))
The "if branch" was not used to switch between GPU and CPU; it is only used to check the input dimension.
The source code is as follows:
if x.ndim == 5:
    return self.forward_time_series(x)
else:
    return self.forward_single_frame(x)
The error is as follows:
File "/home/wudi/Software/yes/envs/torch1.9/lib/python3.8/site-packages/torch/quantization/quantize_jit.py", line 54, in _prepare_jit
    model_c = torch._C._jit_pass_insert_observers(model._c,
RuntimeError: branches for if should return values that are observed consistently, if node:
%5 : Tensor[] = prim::If(%4) # /data00/peterlin/RVM/model/mobilenetv3.py:69:8
  block0():
    %6 : Tensor[] = prim::CallMethod[name="forward_time_series"](%self.1, %x.1) # /data00/peterlin/RVM/model/mobilenetv3.py:70:19
    -> (%6)
  block1():
    %7 : Tensor[] = prim::CallMethod[name="forward_single_frame"](%self.1, %x.1) # /data00/peterlin/RVM/model/mobilenetv3.py:72:19
    -> (%7)
I have met the same problem as you. I guess that's because your original model is built with conv2d, but your quantized model is built with quantized::conv2d; when you try to restore a quantized model from disk, it cannot run quantized::conv2d.new on a float conv2d.
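If that is indeed the cause, one common fix is to rebuild the quantized module structure first and only then load the saved int8 weights. A rough sketch under that assumption (MyModel and "model_int8.pth" are placeholders):

import torch

float_model = MyModel().cpu().eval()  # same architecture used for training
float_model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
prepared = torch.quantization.prepare(float_model)
quantized = torch.quantization.convert(prepared)  # now contains quantized::conv2d ops
quantized.load_state_dict(torch.load("model_int8.pth"))  # load the int8 checkpoint last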
Can you try using eager mode to quantize the if branch? Also, can you describe the problem in a bit more detail: what are you trying to achieve, and what is the output?
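For reference, the eager-mode route sidesteps the JIT check above, because observers are attached to submodules rather than to graph branches, so the Python-level if runs as-is. A minimal sketch with made-up module names, not your actual model:

import torch
import torch.nn as nn

class BranchyModel(nn.Module):
    # Hypothetical stand-in for a model with a dimension check in forward().
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.dequant = torch.quantization.DeQuantStub()
        self.conv = nn.Conv2d(3, 8, 3)

    def forward_single_frame(self, x):
        return self.conv(x)

    def forward_time_series(self, x):
        b, t = x.shape[:2]
        y = self.conv(x.flatten(0, 1))  # fold time into the batch dimension
        return y.unflatten(0, (b, t))

    def forward(self, x):
        x = self.quant(x)
        if x.ndim == 5:                 # plain Python branch, no prim::If involved
            y = self.forward_time_series(x)
        else:
            y = self.forward_single_frame(x)
        return self.dequant(y)

model = BranchyModel().eval()
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
prepared = torch.quantization.prepare(model)
prepared(torch.rand(2, 3, 16, 16))      # calibration pass on CPU
quantized = torch.quantization.convert(prepared)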
@FengMu1995 I would like to ask how you solved this problem; I have the same one, please help. NotImplementedError: Could not run 'quantized::conv2d.new' with arguments from the 'CPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build).