How to adjust the model to eliminate errors in convert_fx()?

Based on the output of symbolic_traced.graph, I adjusted the model so that it supports symbolic tracing, and the trace no longer reports errors. Now I need to work out what else is required before the model can actually be quantized. Specifically, the convert_fx() call reports an error. How should I adjust the model?

from torch.ao.quantization import (
get_default_qconfig_mapping,
get_default_qat_qconfig_mapping,
QConfigMapping,
)
import torch.ao.quantization.quantize_fx as quantize_fx
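
Roughly, this is the FX graph mode flow I am trying (a minimal sketch; `model`, `example_inputs`, `calibration_batches`, and the "x86" backend choice are placeholders for my actual setup):

import torch
from torch.ao.quantization import get_default_qconfig_mapping
import torch.ao.quantization.quantize_fx as quantize_fx

model.eval()
qconfig_mapping = get_default_qconfig_mapping("x86")   # backend choice is an assumption

# insert observers, then run a few representative batches to calibrate them
prepared = quantize_fx.prepare_fx(model, qconfig_mapping, example_inputs)
with torch.no_grad():
    for batch in calibration_batches:
        prepared(batch)

quantized = quantize_fx.convert_fx(prepared)   # this is the call that errors out for me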

this should be supported I think, can you print the error?

can you check out our new quantization mode? Quantization — PyTorch main documentation
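
a rough sketch of that flow (based on the prototype tutorial for recent 2.x releases; the exact capture API has moved between versions, and `model`/`example_inputs` are placeholders):

import torch
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

# capture the model into the aten IR used by PT2 export quantization
exported_model = capture_pre_autograd_graph(model, example_inputs)

quantizer = XNNPACKQuantizer()
quantizer.set_global(get_symmetric_quantization_config())

prepared_model = prepare_pt2e(exported_model, quantizer)
prepared_model(*example_inputs)                # calibration
quantized_model = convert_pt2e(prepared_model)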


In addition to FX mode, I also tried the export mode. It reports the error shown in the picture above:
AssertionError: expecting kwargs for aten op IR to be empty

can you print the full error? it’s hard to see what the problem is just from the error message

This is the full error, and my model contains nn.PixelShuffle. Is this operator supported?

Traceback (most recent call last):
  File "/home/user/ecbsr/ECBSR-main/predict.py", line 63, in <module>
    prepared_model = prepare_pt2e(exported_model, quantizer)
  File "/home/user/miniconda3/envs/ecbsr/lib/python3.9/site-packages/torch/ao/quantization/quantize_pt2e.py", line 109, in prepare_pt2e
    model = prepare(model, node_name_to_scope, is_qat=False)
  File "/home/user/miniconda3/envs/ecbsr/lib/python3.9/site-packages/torch/ao/quantization/pt2e/prepare.py", line 475, in prepare
    _maybe_insert_input_and_output_observers_for_node(node, model, obs_or_fq_map, is_qat)
  File "/home/user/miniconda3/envs/ecbsr/lib/python3.9/site-packages/torch/ao/quantization/pt2e/prepare.py", line 416, in _maybe_insert_input_and_output_observers_for_node
    _maybe_insert_input_observers_for_node(
  File "/home/user/miniconda3/envs/ecbsr/lib/python3.9/site-packages/torch/ao/quantization/pt2e/prepare.py", line 383, in _maybe_insert_input_observers_for_node
    assert (
AssertionError: expecting kwargs for aten op IR to be empty

this is the error: pytorch/torch/ao/quantization/pt2e/prepare.py at main · pytorch/pytorch · GitHub

could you modify /home/user/miniconda3/envs/ecbsr/lib/python3.9/site-packages/torch/ao/quantization/pt2e/prepare.py to print what the op is, or just add a breakpoint there? it could help further identify the issue

    assert (
        node.target == torch.ops.aten.clone.default or
        node.target == torch.ops.aten.zeros_like.default or
        len(node.kwargs) == 0
    ), f" expecting kwargs for aten op IR to be empty | {node.target} | {node.kwargs}"

↑ I modified the source code and printed out the error message: expecting kwargs for aten op IR to be empty | aten.zeros.default | {'device': device(type='cpu'), 'pin_memory': False}

        # self.mask = torch.zeros((self.out_planes, 1, 3, 3), dtype=torch.float32)
        self.mask = torch.zeros(self.out_planes, 1, 3, 3)

↑ I modified the parameters, but it still reports the error
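
For reference, if the mask is a fixed constant, one general way to keep the factory call out of the exported graph is to create it once in __init__ and register it as a buffer, so export lifts it as model state instead of tracing a torch.zeros call (with device/pin_memory kwargs) into the graph. A hypothetical sketch, not the actual ECBSR code:

import torch

class SeqConv(torch.nn.Module):           # stand-in name for the real module
    def __init__(self, out_planes):
        super().__init__()
        # build the mask once at construction time and register it as a buffer
        self.register_buffer("mask", torch.zeros(out_planes, 1, 3, 3))

    def forward(self, x):
        # ... use self.mask here exactly as before ...
        return x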

exported_model = symbolic_trace(model)

graph_module = exported_model.graph
print('before deleting kwargs:', exported_model.graph)

for node in graph_module.nodes:
    if node.op == 'call_function' and node.target == torch.ops.aten.zeros.default:
        new_args = list(node.args)
        new_args[1:] = []              # keep only the size argument
        node.args = tuple(new_args)
        node.kwargs = {}               # drop device/pin_memory, which trip the assert

exported_model.recompile()
print('after deleting kwargs:', exported_model.graph)

quantizer = XNNPACKQuantizer()
quantizer.set_global(get_symmetric_quantization_config())

prepared_model = prepare_pt2e(exported_model, quantizer)

↑ I manually deleted the extra parameters in the graph

before quanti size: 87.1884765625
after  quanti size: 125.1064453125
before quanti predict time:32.16 ms
after  quanti predict time:23.80 ms

↑ The size of the parameters has increased, with only a slight performance improvement

we don’t expect to see a size reduction after prepare_pt2e. can you use convert_pt2e to get a quantized model? also, even at that point we still have to lower the model to a backend before we get a speedup, see (prototype) PyTorch 2 Export Post Training Quantization — PyTorch Tutorials 2.3.0+cu121 documentation
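
something like this (a sketch; `example_inputs` stands in for your calibration data):

from torch.ao.quantization.quantize_pt2e import convert_pt2e

# run some calibration batches through the observed (prepared) model, then convert
prepared_model(*example_inputs)
quantized_model = convert_pt2e(prepared_model)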

The following code has already been executed; otherwise the model inference would not have run properly. My question is why there is so little change in memory usage after quantization. When I used Intel NNCF quantization, they pointed out that the reason is "NNCF does not support quantization of custom PyTorch modules with weights." Does PyTorch have a similar restriction?

quantized_model = convert_pt2e(prepared_model)

can you see the doc: (prototype) PyTorch 2 Export Post Training Quantization — PyTorch Tutorials 2.3.0+cu121 documentation

the model you are getting after convert_pt2e is this: (prototype) PyTorch 2 Export Post Training Quantization — PyTorch Tutorials 2.3.0+cu121 documentation

so there won’t be a perf or memory benefit at that point; you’ll need to lower this model to specific hardware in order to get the benefit
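
for example, on server CPU one documented lowering path is torch.compile/Inductor. A rough sketch (this path is normally paired with X86InductorQuantizer rather than XNNPACKQuantizer, and for XNNPACK the actual lowering target is ExecuTorch; `example_inputs` is a placeholder):

import torch
from torch.ao.quantization.quantize_pt2e import convert_pt2e

quantized_model = convert_pt2e(prepared_model)
# switch exported-model ops such as dropout/batchnorm to eval behavior
torch.ao.quantization.move_exported_model_to_eval(quantized_model)

with torch.no_grad():
    lowered = torch.compile(quantized_model)   # Inductor picks up the quantized patterns
    lowered(*example_inputs)                   # first call triggers compilation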