What does the custom Observed module do?

I am a little confused about the custom observed module when I read the Quantization Custom Module API docs.

The code snippet in the docs is as follows:

import torch

# custom observed module, provided by user
class ObservedCustomModule(torch.nn.Module):
    def __init__(self, linear):
        super().__init__()
        self.linear = linear

    def forward(self, x):
        return self.linear(x)

    @classmethod
    def from_float(cls, float_module):
        assert hasattr(float_module, 'qconfig')
        observed = cls(float_module.linear)
        observed.qconfig = float_module.qconfig
        return observed

The ObservedCustomModule class seems to do the same thing as the wrapped CustomModule.linear (a torch.nn.Linear), except that it also carries the qconfig attribute.
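That is essentially right. Here is a minimal sketch of how such an observed class is actually used: prepare() looks it up in the "float_to_observed_custom_module_class" mapping and calls from_float() to swap the float module out. The CustomModule, Net, and the 4-unit Linear below are made-up names for illustration; only the mapping key and ObservedCustomModule follow the docs.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig, prepare

# Hypothetical float custom module wrapping a Linear (illustrative only)
class CustomModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x)

# The observed wrapper from the docs: same forward, plus from_float()
class ObservedCustomModule(nn.Module):
    def __init__(self, linear):
        super().__init__()
        self.linear = linear

    def forward(self, x):
        return self.linear(x)

    @classmethod
    def from_float(cls, float_module):
        assert hasattr(float_module, 'qconfig')
        observed = cls(float_module.linear)
        observed.qconfig = float_module.qconfig
        return observed

# A parent model containing the custom module
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.custom = CustomModule()

    def forward(self, x):
        return self.custom(x)

model = Net().eval()
model.qconfig = get_default_qconfig("fbgemm")

# prepare() consults this mapping and calls from_float() to do the swap
prepared = prepare(
    model,
    prepare_custom_config_dict={
        "float_to_observed_custom_module_class": {CustomModule: ObservedCustomModule}
    },
)
# prepared.custom is now an ObservedCustomModule carrying the qconfig
```

So the observed class itself is mostly a pass-through: its job is to carry the qconfig (plus any observers you attach) through the prepare step, and to serve as the key for the observed-to-quantized swap later in convert().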

Can someone provide a built-in observed module for reference?

If I want to customize an observed module for quantizing conv2d with 2 bits, what should I do?

hi @111357 , this is a hook for users to insert custom quantization modules.

If you wanted to set up something like conv2d with 2-bit weights, you'd need:

  1. kernels that can do inference with 2-bit weights (or kernels that emulate this). There are no such kernels in PyTorch today, but you could provide a custom function. Writing it in Python would at least get you correct numerics.
  2. either an observer or a fake_quantize module that is set up for 2 bits. You can do this by overriding the qmin/qmax settings on the default observers provided by quantization.
  3. once you have those things, you can plug them into the framework using the custom module API (see Quantization — PyTorch 1.13 documentation).
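Step 2 above can be sketched concretely: configure a FakeQuantize with quant_min=0 and quant_max=3 (i.e. 2**2 - 1) on top of a stock observer. This is only a sketch of the 2-bit numerics, not a full flow; values are still stored as quint8, and only 4 of the 256 levels are used.

```python
import torch
from torch.ao.quantization import FakeQuantize, MovingAverageMinMaxObserver

# 2-bit unsigned range: quant_min=0, quant_max=2**2 - 1 = 3
two_bit_fake_quant = FakeQuantize.with_args(
    observer=MovingAverageMinMaxObserver,
    quant_min=0,
    quant_max=3,
    dtype=torch.quint8,              # storage dtype; only 4 levels are used
    qscheme=torch.per_tensor_affine,
)

fq = two_bit_fake_quant()
x = torch.randn(8, 8)
y = fq(x)  # observes min/max, then fake-quantizes x onto at most 4 levels
```

You could then put this into a QConfig (e.g. as the weight fake_quant) and attach it to your custom observed module before calling prepare, so the 2-bit statistics are collected where your custom conv2d kernel will eventually run.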