Question: backend for quantization in pytorch

Hello, guys
Recently I have been reading the PyTorch source code. I quantized my CNN layers and wanted to see the backend implementation. My system is a Mac M1, so I can't use the GPU (CUDA), only the CPU. In the directory "ATen/native/quantized/cpu" I can see a lot of quantized layers, such as "qconv" and so on.
In the file "qconv.cpp" I can see there are three macros: USE_FBGEMM, USE_PYTORCH_QNNPACK, and AT_MKLDNN_ENABLED. Does this mean that I can use any one of these three? In other words, does PyTorch support three kinds of quantization implementations on CPU, and does the one used depend on which macro is defined?


Thanks in advance.

Yeah, there are different quantization engines; some info here: Quantization — PyTorch 2.0 documentation

and a code pointer here:

I think there’s also some xnnpack support, not 100% sure
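For example, you can query which engines your build was compiled with and select one at runtime from Python; the engine names correspond to those build-time macros. Below is a sketch (assuming your build has at least one of `qnnpack`/`fbgemm` compiled in) that also runs a small eager-mode static quantization flow, whose converted Conv2d dispatches to the qconv kernels in ATen/native/quantized/cpu:

```python
import torch
import torch.nn as nn

# Engines compiled into this build of PyTorch (these map to the
# USE_FBGEMM / USE_PYTORCH_QNNPACK / AT_MKLDNN_ENABLED macros).
print(torch.backends.quantized.supported_engines)

# Pick an engine at runtime. On an M1 Mac 'qnnpack' is typically the
# available one; 'fbgemm' is the usual choice on x86.
engine = ('qnnpack' if 'qnnpack' in torch.backends.quantized.supported_engines
          else 'fbgemm')
torch.backends.quantized.engine = engine

class SmallCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()    # float -> quantized
        self.conv = nn.Conv2d(1, 4, kernel_size=3)
        self.dequant = torch.quantization.DeQuantStub()  # quantized -> float

    def forward(self, x):
        return self.dequant(self.conv(self.quant(x)))

model = SmallCNN().eval()
model.qconfig = torch.quantization.get_default_qconfig(engine)
torch.quantization.prepare(model, inplace=True)
model(torch.randn(1, 1, 8, 8))        # calibration pass to collect stats
torch.quantization.convert(model, inplace=True)

out = model(torch.randn(1, 1, 8, 8))  # runs the quantized conv kernel
print(out.shape)                      # torch.Size([1, 4, 6, 6])
```

Switching `torch.backends.quantized.engine` is how you choose among the backends at runtime; which names are valid depends on which of those macros were defined when your PyTorch was built.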


Thanks guys, it helps a lot.