Why do layers need to be fused in MobileNetV2?

I have seen the static quantization tutorial, where the layers are fused before quantization. I get good results with fused layers, but if I don't fuse them, my accuracy is very poor.

What is the effect of layer fusion?

Please do help me with this.

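Layer fusion fuses Conv + BN into a single Conv module, or Conv + BN + ReLU into a single ConvReLU module. This does not change the numerics by itself: in eval mode the BN scale and shift are folded into the conv weights and bias, so the float output is the same up to rounding. Without fusion, conv, bn and relu are each quantized independently, which might be the reason why the accuracy drops.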
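To make the "does not change numerics" point concrete, here is a minimal sketch using `torch.ao.quantization.fuse_modules` on a tiny Conv + BN + ReLU block. The `ConvBNReLU` class and the randomized BN statistics are made up for illustration; this is not the MobileNetV2 code from the tutorial.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import fuse_modules


class ConvBNReLU(nn.Module):
    """Hypothetical toy block, standing in for one MobileNetV2 conv unit."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(8)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))


torch.manual_seed(0)
model = ConvBNReLU()

# Give BN non-trivial statistics so the folding actually does something.
with torch.no_grad():
    model.bn.running_mean.uniform_(-1.0, 1.0)
    model.bn.running_var.uniform_(0.5, 2.0)
    model.bn.weight.uniform_(0.5, 1.5)
    model.bn.bias.uniform_(-0.5, 0.5)

# Conv + BN fusion for inference requires eval mode.
model.eval()

# Returns a fused copy (inplace=False by default); the original is untouched.
fused = fuse_modules(model, [["conv", "bn", "relu"]])

x = torch.randn(2, 3, 16, 16)
with torch.no_grad():
    max_diff = (model(x) - fused(x)).abs().max().item()

# After fusion, "conv" holds the fused module and "bn"/"relu" become Identity.
print(type(fused.conv).__name__, type(fused.bn).__name__, type(fused.relu).__name__)
print("max abs difference in float:", max_diff)
```

The float outputs should agree up to floating-point rounding, so any accuracy gap between the fused and unfused models comes from how they are quantized afterwards, not from the fusion step.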

But what is the drawback of quantizing the convolution, batchnorm, and ReLU operations independently?

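Quantizing them independently gives worse runtime performance, since each op runs as a separate quantized kernel and writes its own requantized intermediate output. It may also suffer a bigger accuracy loss, because rounding error is introduced at every one of those intermediate outputs instead of only once at the end of the fused op.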
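A toy simulation of the accuracy side, in plain Python. Everything here is an assumption made for illustration: a scalar "conv" (`w * x + b`), made-up BN parameters, and a simple min/max 8-bit fake-quantizer. It is not how PyTorch's observers or kernels are implemented, and input quantization is left out of both paths.

```python
import math
import random


def fake_quantize(values, num_bits=8):
    """Quantize to num_bits with an affine scheme, then dequantize to float."""
    lo, hi = min(0.0, min(values)), max(0.0, max(values))
    qmax = 2 ** num_bits - 1
    scale = (hi - lo) / qmax or 1.0  # guard against an all-zero tensor
    zero_point = round(-lo / scale)
    out = []
    for v in values:
        q = min(qmax, max(0, round(v / scale) + zero_point))
        out.append((q - zero_point) * scale)
    return out, scale


# Hypothetical layer parameters (scalar stand-ins for conv and BN).
w, b = 1.7, 0.3
gamma, beta, mean, var, eps = 0.9, -0.2, 0.5, 4.0, 1e-5
inv_std = 1.0 / math.sqrt(var + eps)


def conv(x):
    return w * x + b


def bn(y):
    return gamma * (y - mean) * inv_std + beta


def relu(y):
    return max(0.0, y)


rng = random.Random(0)
xs = [rng.uniform(-3.0, 3.0) for _ in range(2000)]
reference = [relu(bn(conv(x))) for x in xs]  # float result

# Unfused: every op's output is rounded to 8 bits before the next op runs.
a, _ = fake_quantize([conv(x) for x in xs])
c, _ = fake_quantize([bn(v) for v in a])
unfused, _ = fake_quantize([relu(v) for v in c])

# Fused: BN folded into the conv, ReLU applied, output rounded once.
w_f = w * gamma * inv_std
b_f = (b - mean) * gamma * inv_std + beta
fused, fused_scale = fake_quantize([relu(w_f * x + b_f) for x in xs])


def mean_abs_error(got, want):
    return sum(abs(g - t) for g, t in zip(got, want)) / len(want)


err_unfused = mean_abs_error(unfused, reference)
err_fused = mean_abs_error(fused, reference)
print("mean abs error, unfused:", err_unfused)
print("mean abs error, fused:  ", err_fused)
```

In this toy setup the fused path rounds once, so its error is bounded by half a quantization step of the final output, while the unfused path stacks three rounding errors on top of each other.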