How to adjust the model to eliminate errors in convert_fx()?

Based on the output of symbolic_traced.graph, I adjusted the model so that it supports symbolic tracing, and the trace no longer reports errors. Now I need to work out what else is required before the model can actually be quantized. Specifically, the convert_fx() call reports an error. How should I adjust the model?

from torch.ao.quantization import (
get_default_qconfig_mapping,
get_default_qat_qconfig_mapping,
QConfigMapping,
)
import torch.ao.quantization.quantize_fx as quantize_fx
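
Roughly, this is the FX graph mode flow I am trying (a minimal sketch; `model`, `example_inputs`, `calibration_batches`, and the "x86" backend choice are placeholders for my actual setup):

import torch
from torch.ao.quantization import get_default_qconfig_mapping
import torch.ao.quantization.quantize_fx as quantize_fx

model.eval()
qconfig_mapping = get_default_qconfig_mapping("x86")   # backend choice is an assumption

# insert observers, then run a few representative batches to calibrate them
prepared = quantize_fx.prepare_fx(model, qconfig_mapping, example_inputs)
with torch.no_grad():
    for batch in calibration_batches:
        prepared(batch)

quantized = quantize_fx.convert_fx(prepared)   # this is the call that errors out for me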

this should be supported I think, can you print the error?

can you check out our new quantization mode? Quantization — PyTorch main documentation
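
a rough sketch of that flow (based on the prototype tutorial for recent 2.x releases; the exact capture API has moved between versions, and `model`/`example_inputs` are placeholders):

import torch
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

# capture the model into the aten IR used by PT2 export quantization
exported_model = capture_pre_autograd_graph(model, example_inputs)

quantizer = XNNPACKQuantizer()
quantizer.set_global(get_symmetric_quantization_config())

prepared_model = prepare_pt2e(exported_model, quantizer)
prepared_model(*example_inputs)                # calibration
quantized_model = convert_pt2e(prepared_model)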


In addition to FX mode, I also tried the export mode. It reports the error shown in the picture above:
AssertionError: expecting kwargs for aten op IR to be empty

can you print the full error? it’s hard to see what the problem is just from the error message

This is the full error, and my model contains nn.PixelShuffle. Is this operator supported?

Traceback (most recent call last):
  File "/home/user/ecbsr/ECBSR-main/predict.py", line 63, in <module>
    prepared_model = prepare_pt2e(exported_model, quantizer)
  File "/home/user/miniconda3/envs/ecbsr/lib/python3.9/site-packages/torch/ao/quantization/quantize_pt2e.py", line 109, in prepare_pt2e
    model = prepare(model, node_name_to_scope, is_qat=False)
  File "/home/user/miniconda3/envs/ecbsr/lib/python3.9/site-packages/torch/ao/quantization/pt2e/prepare.py", line 475, in prepare
    _maybe_insert_input_and_output_observers_for_node(node, model, obs_or_fq_map, is_qat)
  File "/home/user/miniconda3/envs/ecbsr/lib/python3.9/site-packages/torch/ao/quantization/pt2e/prepare.py", line 416, in _maybe_insert_input_and_output_observers_for_node
    _maybe_insert_input_observers_for_node(
  File "/home/user/miniconda3/envs/ecbsr/lib/python3.9/site-packages/torch/ao/quantization/pt2e/prepare.py", line 383, in _maybe_insert_input_observers_for_node
    assert (
AssertionError: expecting kwargs for aten op IR to be empty

this is the error: pytorch/torch/ao/quantization/pt2e/prepare.py at main · pytorch/pytorch · GitHub

could you modify /home/user/miniconda3/envs/ecbsr/lib/python3.9/site-packages/torch/ao/quantization/pt2e/prepare.py to print what the op is, or just add a breakpoint there? it could help further identify the issue

    assert (
        node.target == torch.ops.aten.clone.default or
        node.target == torch.ops.aten.zeros_like.default or
        len(node.kwargs) == 0
    ), f" expecting kwargs for aten op IR to be empty | {node.target} | {node.kwargs}"

↑ I modified the source code and printed out the error message: expecting kwargs for aten op IR to be empty | aten.zeros.default | {'device': device(type='cpu'), 'pin_memory': False}

        # self.mask = torch.zeros((self.out_planes, 1, 3, 3), dtype=torch.float32)
        self.mask = torch.zeros(self.out_planes, 1, 3, 3)

↑ I modified the parameters, but it still reports the error
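
For reference, if the mask is a fixed constant, one general way to keep the factory call out of the exported graph is to create it once in __init__ and register it as a buffer, so export lifts it as model state instead of tracing a torch.zeros call (with device/pin_memory kwargs) into the graph. A hypothetical sketch, not the actual ECBSR code:

import torch

class SeqConv(torch.nn.Module):           # stand-in name for the real module
    def __init__(self, out_planes):
        super().__init__()
        # build the mask once at construction time and register it as a buffer
        self.register_buffer("mask", torch.zeros(out_planes, 1, 3, 3))

    def forward(self, x):
        # ... use self.mask here exactly as before ...
        return x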

exported_model = symbolic_trace(model)

graph_module = exported_model.graph
print('before deleting kwargs:', exported_model.graph)

for node in graph_module.nodes:
    if node.op == 'call_function' and node.target == torch.ops.aten.zeros.default:
        new_args = list(node.args)
        new_args[1:] = []              # keep only the size argument
        node.args = tuple(new_args)
        node.kwargs = {}               # drop device/pin_memory, which trip the assert

exported_model.recompile()
print('after deleting kwargs:', exported_model.graph)

quantizer = XNNPACKQuantizer()
quantizer.set_global(get_symmetric_quantization_config())

prepared_model = prepare_pt2e(exported_model, quantizer)

↑ I manually deleted the extra parameters in the graph

before quanti size: 87.1884765625
after  quanti size: 125.1064453125
before quanti predict time:32.16 ms
after  quanti predict time:23.80 ms

↑ The size of the parameters has increased, with only a slight performance improvement

we don’t expect to see a size reduction after prepare_pt2e. can you use convert_pt2e to get a quantized model? also, even at that point we still have to lower the model to a backend before we get a speedup, see (prototype) PyTorch 2 Export Post Training Quantization — PyTorch Tutorials 2.3.0+cu121 documentation
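
something like this (a sketch; `example_inputs` stands in for your calibration data):

from torch.ao.quantization.quantize_pt2e import convert_pt2e

# run some calibration batches through the observed (prepared) model, then convert
prepared_model(*example_inputs)
quantized_model = convert_pt2e(prepared_model)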

The following code has already been executed; otherwise the model inference would not have run properly. My question is why there is so little change in memory usage after quantization. When I used Intel NNCF quantization, they pointed out that the reason is "NNCF does not support quantization of custom PyTorch modules with weights." Does PyTorch have a similar restriction?

quantized_model = convert_pt2e(prepared_model)

can you see the doc: (prototype) PyTorch 2 Export Post Training Quantization — PyTorch Tutorials 2.3.0+cu121 documentation

the model you are getting after convert_pt2e is this: (prototype) PyTorch 2 Export Post Training Quantization — PyTorch Tutorials 2.3.0+cu121 documentation

so there won’t be a perf or memory benefit at that point; you’ll need to lower this model to specific hardware in order to get the benefit
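
for example, on server CPU one documented lowering path is torch.compile/Inductor. A rough sketch (this path is normally paired with X86InductorQuantizer rather than XNNPACKQuantizer, and for XNNPACK the actual lowering target is ExecuTorch; `example_inputs` is a placeholder):

import torch
from torch.ao.quantization.quantize_pt2e import convert_pt2e

quantized_model = convert_pt2e(prepared_model)
# switch exported-model ops such as dropout/batchnorm to eval behavior
torch.ao.quantization.move_exported_model_to_eval(quantized_model)

with torch.no_grad():
    lowered = torch.compile(quantized_model)   # Inductor picks up the quantized patterns
    lowered(*example_inputs)                   # first call triggers compilation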