QAT: trace_model 'QuantizedCPU' backend error

I have a QAT quantized_model that runs without problems:

import torch

quantized_model.eval()
_ = quantized_model(torch.rand(1, 3, 300, 300))
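
For reference, here is a minimal sketch of how a quantized_model like this can be produced with eager-mode QAT (the toy network below is only an assumed stand-in for my actual model, which includes a MaxPool2d layer):

import torch
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical stand-in, not the actual model
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool2d(2)  # the op that fails later in the trace
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.pool(self.relu(self.conv(x)))
        return self.dequant(x)

model = TinyNet().train()
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
torch.quantization.prepare_qat(model, inplace=True)
# ... QAT fine-tuning iterations would go here ...
model.eval()
quantized_model = torch.quantization.convert(model)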

and it can also be traced successfully:

trace_model = torch.jit.trace(quantized_model, torch.rand(1, 3, 300, 300))

but when I try to run trace_model as below:

trace_model.eval()
_ = trace_model(torch.rand(1, 3, 300, 300))

I encountered the following error message:

  _ = trace_model(torch.rand(1,3,300,300))
  File "/home/paul/rknn2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "<string>", line 63, in <foward op>
                       dilation: List[int],
                       ceil_mode: bool):
            output, indices = torch.max_pool2d_with_indices(self, kernel_size, stride, padding, dilation, ceil_mode)
                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            def backward(grad_output):
                grad_self = torch.max_pool2d_with_indices_backward(grad_output, self, kernel_size, stride, padding, dilation, ceil_mode, indices)
RuntimeError: Could not run 'aten::max_pool2d_with_indices' with arguments from the 'QuantizedCPU' backend. 'aten::max_pool2d_with_indices' is only available for these backends: [CPU, CUDA, Named, Autograd, Profiler, Tracer].
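
For what it's worth, the failing op can be reproduced in isolation on a bare quantized tensor (a minimal check, assuming eager-mode quantization):

import torch
import torch.nn.functional as F

x = torch.quantize_per_tensor(torch.rand(1, 3, 8, 8), scale=0.1, zero_point=0, dtype=torch.quint8)

y = F.max_pool2d(x, 2)  # OK: aten::max_pool2d has a QuantizedCPU kernel

# The indices-returning variant dispatches to aten::max_pool2d_with_indices,
# which has no QuantizedCPU kernel and raises the same RuntimeError:
y, idx = F.max_pool2d(x, 2, return_indices=True)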

What am I missing? How can I fix this error? Thank you for your help.

Looks like the quantized TorchScript is not playing well with autograd: with gradients enabled, the interpreter substitutes the differentiable max_pool2d_with_indices variant (the <foward op> in your traceback), and that op has no QuantizedCPU kernel. Try running without gradients:

with torch.no_grad():
    _ = trace_model(torch.rand(1, 3, 300, 300))

Otherwise I'm not sure; if you can give a minimal repro, that would help.

@HDCharles

Thanks for the help. It works now!

The reason I created this test script was to reproduce the same “QuantizedCPU” backend error I encountered when I saved trace_model to a file, say “trace_model.pt”, and then tried to load it using

rknn.load_pytorch(model="trace_model.pt", input_size_list=input_size_list)

where rknn is a toolkit for the Rockchip NPU.

Now my question is: is there a way to save trace_model under torch.no_grad(), so that rknn.load_pytorch does not hit the “QuantizedCPU” backend error?

Or do I have to modify the rknn.load_pytorch module to add

with torch.no_grad():

before it parses the model script?

Thank you again for your help.

You can try just shoving that into the forward of the model; not sure if that would work.
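
E.g., something like this wrapper (an untested sketch; NoGradWrapper is just my invention, not a PyTorch or rknn API):

import torch
import torch.nn as nn

class NoGradWrapper(nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        with torch.no_grad():  # keep grad mode off inside the traced forward
            return self.model(x)

trace_model = torch.jit.trace(NoGradWrapper(quantized_model), torch.rand(1, 3, 300, 300))
trace_model.save("trace_model.pt")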

Note: don't do something like

if train_flag:
    code
else:
    with torch.no_grad():
        code

because the if statement will break the tracing.

Alternatively, you can try:

with torch.no_grad():
    rknn.load_pytorch(…)