Quantization of EfficientNet Poor Performance

Hi there!

I am currently trying to quantize an EfficientNet MultiHead model (from timm) using the Post Training Static quantization approach mentioned in the PyTorch documentation (Eager Mode).

Unfortunately, the model’s performance decreases significantly after being quantized (90% accuracy to 49%).

For the sake of simplicity, I am only showing a SingleHead version of the model (a few of the InvertedResidual layers are cut out due to character limits on the post, indicated by “…”), however it also performs very poorly.

EffNet_multihead(
  (features): Sequential(
    (0): Conv2dSame(3, 24, kernel_size=(3, 3), stride=(2, 2), bias=False)
    (1): QuantStub()
    (2): BatchNorm2d(24, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    (3): DeQuantStub()
    (4): SiLU(inplace=True)
    (5): Sequential(
      (0): ConvBnAct(
        (conv): Conv2d(24, 24, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(24, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (1): ConvBnAct(
        (conv): Conv2d(24, 24, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(24, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
    )
    (6): Sequential(
      (0): EdgeResidual(
        (conv_exp): Conv2dSame(24, 96, kernel_size=(3, 3), stride=(2, 2), bias=False)
        (bn1): BatchNorm2d(96, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (se): Identity()
        (conv_pwl): Conv2d(96, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn2): BatchNorm2d(48, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (1): EdgeResidual(
        (conv_exp): Conv2d(48, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (se): Identity()
        (conv_pwl): Conv2d(192, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn2): BatchNorm2d(48, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (2): EdgeResidual(
        (conv_exp): Conv2d(48, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (se): Identity()
        (conv_pwl): Conv2d(192, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn2): BatchNorm2d(48, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (3): EdgeResidual(
        (conv_exp): Conv2d(48, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (se): Identity()
        (conv_pwl): Conv2d(192, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn2): BatchNorm2d(48, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
    )
    (7): Sequential(
      (0): EdgeResidual(
        (conv_exp): Conv2dSame(48, 192, kernel_size=(3, 3), stride=(2, 2), bias=False)
        (bn1): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (se): Identity()
        (conv_pwl): Conv2d(192, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (1): EdgeResidual(
        (conv_exp): Conv2d(64, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (se): Identity()
        (conv_pwl): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (2): EdgeResidual(
        (conv_exp): Conv2d(64, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (se): Identity()
        (conv_pwl): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (3): EdgeResidual(
        (conv_exp): Conv2d(64, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (se): Identity()
        (conv_pwl): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
    )
    (8): Sequential(
      (0): InvertedResidual(
        (conv_pw): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2dSame(256, 256, kernel_size=(3, 3), stride=(2, 2), groups=256, bias=False)
        (bn2): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(256, 16, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(16, 256, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (1): InvertedResidual(
        (conv_pw): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512, bias=False)
        (bn2): BatchNorm2d(512, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(512, 32, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(32, 512, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (2): InvertedResidual(
        (conv_pw): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512, bias=False)
        (bn2): BatchNorm2d(512, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(512, 32, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(32, 512, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (3): InvertedResidual(
        (conv_pw): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512, bias=False)
        (bn2): BatchNorm2d(512, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(512, 32, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(32, 512, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
...
      (6): InvertedResidual(
        (conv_pw): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(960, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)
        (bn2): BatchNorm2d(960, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(960, 40, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(40, 960, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(160, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (7): InvertedResidual(
        (conv_pw): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(960, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)
        (bn2): BatchNorm2d(960, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(960, 40, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(40, 960, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(160, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (8): InvertedResidual(
        (conv_pw): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(960, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)
        (bn2): BatchNorm2d(960, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(960, 40, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(40, 960, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(160, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
    )
  )
  (head0): EffNet_head(
    (features): Sequential(
      (0): InvertedResidual(
        (conv_pw): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(960, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2dSame(960, 960, kernel_size=(3, 3), stride=(2, 2), groups=960, bias=False)
        (bn2): BatchNorm2d(960, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(960, 40, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(40, 960, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(960, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (1): InvertedResidual(
        (conv_pw): Conv2d(256, 1536, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536, bias=False)
        (bn2): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(1536, 64, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(64, 1536, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(1536, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (2): InvertedResidual(
        (conv_pw): Conv2d(256, 1536, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536, bias=False)
        (bn2): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(1536, 64, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(64, 1536, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(1536, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (3): InvertedResidual(
        (conv_pw): Conv2d(256, 1536, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536, bias=False)
        (bn2): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(1536, 64, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(64, 1536, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(1536, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (4): InvertedResidual(
        (conv_pw): Conv2d(256, 1536, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536, bias=False)
        (bn2): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(1536, 64, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(64, 1536, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(1536, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (5): InvertedResidual(
        (conv_pw): Conv2d(256, 1536, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536, bias=False)
        (bn2): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(1536, 64, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(64, 1536, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(1536, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (6): InvertedResidual(
        (conv_pw): Conv2d(256, 1536, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536, bias=False)
        (bn2): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(1536, 64, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(64, 1536, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(1536, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (7): InvertedResidual(
        (conv_pw): Conv2d(256, 1536, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536, bias=False)
        (bn2): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(1536, 64, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(64, 1536, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(1536, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (8): InvertedResidual(
        (conv_pw): Conv2d(256, 1536, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536, bias=False)
        (bn2): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(1536, 64, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(64, 1536, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(1536, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (9): InvertedResidual(
        (conv_pw): Conv2d(256, 1536, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536, bias=False)
        (bn2): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(1536, 64, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(64, 1536, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(1536, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (10): InvertedResidual(
        (conv_pw): Conv2d(256, 1536, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536, bias=False)
        (bn2): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(1536, 64, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(64, 1536, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(1536, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (11): InvertedResidual(
        (conv_pw): Conv2d(256, 1536, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536, bias=False)
        (bn2): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(1536, 64, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(64, 1536, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(1536, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (12): InvertedResidual(
        (conv_pw): Conv2d(256, 1536, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536, bias=False)
        (bn2): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(1536, 64, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(64, 1536, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(1536, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (13): InvertedResidual(
        (conv_pw): Conv2d(256, 1536, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536, bias=False)
        (bn2): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(1536, 64, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(64, 1536, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(1536, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
      (14): InvertedResidual(
        (conv_pw): Conv2d(256, 1536, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (conv_dw): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536, bias=False)
        (bn2): BatchNorm2d(1536, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (act2): SiLU(inplace=True)
        (se): SqueezeExcite(
          (conv_reduce): Conv2d(1536, 64, kernel_size=(1, 1), stride=(1, 1))
          (act1): SiLU(inplace=True)
          (conv_expand): Conv2d(64, 1536, kernel_size=(1, 1), stride=(1, 1))
          (gate): Sigmoid()
          (quant): QuantStub()
          (dequant): DeQuantStub()
        )
        (conv_pwl): Conv2d(1536, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (quant): QuantStub()
        (dequant): DeQuantStub()
      )
    )
    (conv_head): Conv2d(256, 1280, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn2): BatchNorm2d(1280, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act2): ReLU(inplace=True)
    (drop): Dropout(p=0.5, inplace=False)
    (global_pool): SelectAdaptivePool2d (pool_type=avg, flatten=Flatten(start_dim=1, end_dim=-1))
    (classifier): Linear(in_features=1280, out_features=2, bias=True)
    (quant): QuantStub()
    (dequant): DeQuantStub()
  )
)

Please ignore the order that the quant and dequant layers are shown here, in the forward pass of each block there are ordered correctly.

I have tried only quantizing parts of the model and have noticed that the largest performance decrease comes from the InvertedResidual blocks.

I suspect that this is due to how deep the network is and might require Quantization Aware Training, but I am not sure.

To quantize the model once it is generated, I run the following:

torch.quantization.prepare(model, inplace=True)
model.eval()
model.to('cpu')

I then pass a sample dataset through it to calibrate it, and run

model = torch.quantization.convert(model)

I know that in the code above I have not done any fusion of layers; however, the performance is still very poor with fusion so I am trying to resolve the issue without it.

Is there anything I am doing that might stand out as incorrect? Please let me know if there is any information missing here.

PyTorch version: 1.10.1
Torchvision version: 0.11.2

Thank you.

This matches some of our experiments showing that EfficientNet_B0 using torchvision’s published weights is not PTQ friendly - we saw a drop of top-1 on ImageNet1k from 77.7% for fp32 to 28.7% with the full model statically quantized to int8 with PTQ.

You could try using QAT to recover some of this accuracy.