Exclude specific layers from quantization

Hi,

I'm working on network quantization using the PyTorch 2 export flow and torchao, and I would like to apply layer-specific quantization strategies to my model to maximize performance. In torchao, when writing your quantizer, you can define how to annotate (or not annotate) a module/function based on its type, but I don't see any way to change the quantization strategy based on the position of the module/function in the network. Say I have a U-Net architecture and I want to avoid quantizing the first layer of the encoder: how can I achieve this?

Thanks!

You can probably write code to achieve this yourself; something like:

def annotate(self, gm: GraphModule):
    linear_count = 0
    for n in gm.graph.nodes:
        if n.op == "call_module" and isinstance(gm.get_submodule(n.target), torch.nn.Linear):
            linear_count += 1
            if linear_count == 1:
                continue  # skip the first linear layer
            annotate_linear(...)
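To make the counting idea concrete, here is a minimal, self-contained sketch using `torch.fx.symbolic_trace` on a toy model: it walks the graph in order, counts `call_module` nodes that are `nn.Linear`, and collects every linear except the first as a candidate for annotation. The model (`TinyEncoder`) and the `to_annotate` list are hypothetical illustrations, not torchao API; note also that a real PT2 export (`torch.export`) graph contains lowered `call_function` aten ops rather than `call_module` nodes, so there you would match on the aten op instead, but the positional-counting logic is the same.

```python
import torch
import torch.nn as nn
from torch.fx import symbolic_trace

class TinyEncoder(nn.Module):
    """Toy stand-in for an encoder with three linear layers."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 8)
        self.fc2 = nn.Linear(8, 8)
        self.fc3 = nn.Linear(8, 8)

    def forward(self, x):
        return self.fc3(self.fc2(self.fc1(x)))

gm = symbolic_trace(TinyEncoder())

linear_count = 0
to_annotate = []  # names of layers that WOULD be annotated for quantization
for n in gm.graph.nodes:
    # For call_module nodes, n.target is the submodule's qualified name
    if n.op == "call_module" and isinstance(gm.get_submodule(n.target), nn.Linear):
        linear_count += 1
        if linear_count == 1:
            continue  # leave the first linear layer un-annotated
        to_annotate.append(n.target)

print(to_annotate)  # -> ['fc2', 'fc3']
```

Graph nodes are iterated in topological (execution) order, so "first linear" here means the first one executed in the forward pass, which is what you want for skipping the initial encoder layer.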