How to quantize a Swin transformer model to reduce its size?


I’m a beginner and I have a project where I need to quantize models using PyTorch. I managed to find a way to quantize a ResNet model, but it doesn’t seem to work on a Swin Transformer model since that model has a different architecture. Does anyone have any ideas on how to quantize a Swin Transformer model?

Can you give more context, such as links, code to reproduce, etc.?

Basically, I was trying to apply quantization-aware training (QAT) to a Swin Transformer model (inspired by this work: PyTorch Quantization Aware Training - Lei Mao's Log Book), since it gives better accuracy than other techniques. However, the Swin Transformer has a totally different architecture, so that recipe doesn’t carry over.
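For context, the eager-mode QAT recipe from that post looks roughly like the sketch below (a minimal toy model, not the actual ResNet from the article). The key point is that eager mode requires manually inserting `QuantStub`/`DeQuantStub` and knowing which modules to mark, which is why a recipe written for one architecture doesn't transfer directly to Swin's attention blocks:

```python
# Minimal eager-mode QAT sketch on a toy CNN (an assumption of what the
# linked recipe does, not the article's exact code). The stubs must be
# placed by hand, which makes the approach model-specific.
import torch
import torch.nn as nn
from torch.ao.quantization import (
    QuantStub, DeQuantStub, get_default_qat_qconfig, prepare_qat, convert,
)

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()      # marks where float -> int8 happens
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = DeQuantStub()  # marks where int8 -> float happens

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)

model = TinyNet()
model.train()
model.qconfig = get_default_qat_qconfig("fbgemm")
prepare_qat(model, inplace=True)   # inserts fake-quantize observers

# ... the QAT fine-tuning loop would go here ...

model.eval()
quantized = convert(model)         # swap modules for real int8 kernels
out = quantized(torch.randn(1, 3, 8, 8))
print(out.shape)  # torch.Size([1, 8, 8, 8])
```

For Swin, there is no obvious place to put the stubs inside the windowed-attention blocks, which is what makes this eager-mode workflow awkward for transformer architectures.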

Are you using eager mode quantization? Have you tried FX graph mode quantization: (prototype) FX Graph Mode Post Training Static Quantization — PyTorch Tutorials 1.12.0+cu102 documentation?
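A minimal sketch of the FX graph mode flow, using a hypothetical `TinyMLP` stand-in rather than an actual Swin model: unlike eager mode, no stub insertion is needed because the model is symbolically traced. (A real Swin model may contain ops that aren't symbolically traceable out of the box and could need extra configuration, e.g. marking submodules as non-traceable.)

```python
# FX graph mode post-training static quantization sketch. TinyMLP is a
# toy stand-in for a real model; the API calls are the standard
# torch.ao.quantization FX workflow.
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class TinyMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(16, 32)
        self.act = nn.ReLU()
        self.fc2 = nn.Linear(32, 16)

    def forward(self, x):
        return self.fc2(self.act(self.fc1(x)))

float_model = TinyMLP().eval()
example_inputs = (torch.randn(4, 16),)
qconfig_mapping = get_default_qconfig_mapping("fbgemm")

# Symbolically trace the model and insert observers automatically.
prepared = prepare_fx(float_model, qconfig_mapping, example_inputs)

# Calibrate with representative data (random here, real data in practice).
with torch.no_grad():
    for _ in range(8):
        prepared(torch.randn(4, 16))

quantized = convert_fx(prepared)  # lower observed graph to int8 ops
out = quantized(*example_inputs)
print(out.shape)  # torch.Size([4, 16])
```

Because the quantization points are derived from the traced graph rather than hand-placed stubs, this workflow tends to generalize better across architectures than eager mode.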

Also, please take a look at the API summary: Quantization — PyTorch 1.12 documentation