QAT: trace_model 'QuantizedCPU' backend error

I have a QAT quantized_model that runs without problems:

import torch

quantized_model.eval()
_ = quantized_model(torch.rand(1, 3, 300, 300))
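
For reference, here is a minimal sketch of how a quantized_model like this can be produced with eager-mode QAT (the toy network below is only an assumed stand-in for my actual model, which includes a MaxPool2d layer):

import torch
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical stand-in, not the actual model
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool2d(2)  # the op that fails later in the trace
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.pool(self.relu(self.conv(x)))
        return self.dequant(x)

model = TinyNet().train()
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
torch.quantization.prepare_qat(model, inplace=True)
# ... QAT fine-tuning iterations would go here ...
model.eval()
quantized_model = torch.quantization.convert(model)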

and it can also be traced successfully:

trace_model = torch.jit.trace(quantized_model, torch.rand(1, 3, 300, 300))

but when I try to run trace_model as below:

trace_model.eval()
_ = trace_model(torch.rand(1, 3, 300, 300))

I encountered the following error message:

  _ = trace_model(torch.rand(1,3,300,300))
  File "/home/paul/rknn2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "<string>", line 63, in <foward op>
                       dilation: List[int],
                       ceil_mode: bool):
            output, indices = torch.max_pool2d_with_indices(self, kernel_size, stride, padding, dilation, ceil_mode)
                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            def backward(grad_output):
                grad_self = torch.max_pool2d_with_indices_backward(grad_output, self, kernel_size, stride, padding, dilation, ceil_mode, indices)
RuntimeError: Could not run 'aten::max_pool2d_with_indices' with arguments from the 'QuantizedCPU' backend. 'aten::max_pool2d_with_indices' is only available for these backends: [CPU, CUDA, Named, Autograd, Profiler, Tracer].
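
For what it's worth, the failing op can be reproduced in isolation on a bare quantized tensor (a minimal check, assuming eager-mode quantization):

import torch
import torch.nn.functional as F

x = torch.quantize_per_tensor(torch.rand(1, 3, 8, 8), scale=0.1, zero_point=0, dtype=torch.quint8)

y = F.max_pool2d(x, 2)  # OK: aten::max_pool2d has a QuantizedCPU kernel

# The indices-returning variant dispatches to aten::max_pool2d_with_indices,
# which has no QuantizedCPU kernel and raises the same RuntimeError:
y, idx = F.max_pool2d(x, 2, return_indices=True)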

What am I missing? How can I fix this error? Thank you for your help.

Looks like the quantized TorchScript is not playing well with autograd: with gradients enabled, the interpreter substitutes the differentiable max_pool2d_with_indices variant (the <foward op> in your traceback), and that op has no QuantizedCPU kernel. Try running without gradients:

with torch.no_grad():
    _ = trace_model(torch.rand(1, 3, 300, 300))

Otherwise I'm not sure; if you can give a minimal repro, that would help.

@HDCharles

Thanks for the help. It works now!

The reason I created this test script was to reproduce the same “QuantizedCPU” backend error I encountered when I saved trace_model to a file, say “trace_model.pt”, and then tried to load it using

rknn.load_pytorch(model="trace_model.pt", input_size_list=input_size_list)

where rknn is a toolkit for the Rockchip NPU.

Now my question is: is there a way to save trace_model under torch.no_grad(), so that rknn.load_pytorch does not hit the “QuantizedCPU” backend error?

Or do I have to modify the rknn.load_pytorch module to add

with torch.no_grad():

before it parses the model script?

Thank you again for your help.

You can try just shoving that into the forward of the model; not sure if that would work.
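
E.g., something like this wrapper (an untested sketch; NoGradWrapper is just my invention, not a PyTorch or rknn API):

import torch
import torch.nn as nn

class NoGradWrapper(nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        with torch.no_grad():  # keep grad mode off inside the traced forward
            return self.model(x)

trace_model = torch.jit.trace(NoGradWrapper(quantized_model), torch.rand(1, 3, 300, 300))
trace_model.save("trace_model.pt")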

Note: don't do something like

if train_flag:
    code
else:
    with torch.no_grad():
        code

because the if statement will break the tracing.

Alternatively, you can try:

with torch.no_grad():
    rknn.load_pytorch(…)