MobileNetV2 changes in model during quantization

Hi all, I need help in understanding some of the changes to the model during quantization.

Pre-quantization setup:

  1. ReLU6 converted to ReLU
  2. QuantStub and DeQuantStub added to the end of the model (after classifier)
    (quant): QuantStub()
    (dequant): DeQuantStub()

After QAT and convert:

  3. Batch norm layers seem to be removed
    First block of the model after the pre-quantization setup:
    (features): Sequential(
      (0): Conv2dNormActivation(
        (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU()
    First block of the model after quantization:
    (features): Sequential(
      (0): Conv2dNormActivation(
        (0): QuantizedConvReLU2d(3, 32, kernel_size=(3, 3), stride=(2, 2), scale=0.030749445781111717, zero_point=0, padding=(1, 1))
        (1): Identity()
        (2): Identity()

What would the reasoning be behind these changes?

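For the removal of the batchnorm layers: at inference time batchnorm is just a fixed per-channel scale + shift, and that can be folded into the weights and bias of the preceding conv. Once it is folded in, a separate scale + shift would be redundant, which would explain why the BatchNorm2d slot shows up as Identity() after convert. This applies to QAT as well as post-training quantization.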
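To make the scale + shift argument concrete, here is a minimal sketch in plain Python of the folding arithmetic for a single output channel at a single output position. The function names and all numeric values are made up for illustration; this is not PyTorch's actual implementation, only the algebra it relies on.

```python
import math

def conv_at_one_position(patch, weights, bias):
    # A conv output for one channel at one position is a dot product plus bias.
    return sum(x * w for x, w in zip(patch, weights)) + bias

def batchnorm_eval(y, gamma, beta, mean, var, eps=1e-5):
    # Inference-time batchnorm: a fixed scale + shift using the running stats.
    return gamma * (y - mean) / math.sqrt(var + eps) + beta

def fold_bn_into_conv(weights, bias, gamma, beta, mean, var, eps=1e-5):
    # Hypothetical helper: absorb the batchnorm scale + shift into the conv.
    scale = gamma / math.sqrt(var + eps)
    folded_weights = [w * scale for w in weights]
    folded_bias = (bias - mean) * scale + beta
    return folded_weights, folded_bias

# Illustrative values only (assumptions, not taken from the model above).
patch = [0.5, -1.0, 2.0]
weights = [0.2, 0.4, -0.1]
bias = 0.0  # the printed Conv2d uses bias=False
gamma, beta, mean, var = 1.5, 0.1, 0.3, 0.25

# Conv followed by a separate batchnorm.
separate = batchnorm_eval(
    conv_at_one_position(patch, weights, bias), gamma, beta, mean, var
)

# A single conv with the batchnorm folded into its weights and bias.
folded_weights, folded_bias = fold_bn_into_conv(
    weights, bias, gamma, beta, mean, var
)
fused = conv_at_one_position(patch, folded_weights, folded_bias)

# Both paths agree up to floating-point rounding.
assert math.isclose(separate, fused, rel_tol=1e-9, abs_tol=1e-12)
```

Note that the folded conv gains a bias even when the original conv had bias=False, because the batchnorm shift has to go somewhere.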

That’s what I thought too! Any ideas on why relu6 gets converted to relu?