GFPGAN-Quantization

Hi,
I have been trying to quantize a GFPGAN model to increase the inference speed.
Here is my code for the same, written after watching the official PyTorch YouTube video and reading the documentation. It seems like I am missing something critical 🙂
Repo link: GFPGAN/gfpgan at master · TencentARC/GFPGAN · GitHub

1- My quantized model's file size is larger than the original model's.
My code:

import cv2
import glob
import numpy as np
import os
import torch
from basicsr.utils import imwrite

from gfpgan import GFPGANer
import copy

import time
from time import perf_counter
from gfpgan.archs.gfpganv1_clean_arch import GFPGANv1Clean
from torch import nn

# build the original (FP32) model
mymodel = GFPGANv1Clean(
                out_size=512,
                num_style_feat=512,
                channel_multiplier=2,
                decoder_load_path=None,
                fix_decoder=False,
                num_mlp=8,
                input_is_latent=True,
                different_w=True,
                narrow=1,
                sft_half=True)

#load the state_dict
mymodel.load_state_dict(torch.load('C:/Users/Risin/Desktop/GFPGAN/GFPGANv1.3.pth')['params_ema'])

# set to eval()
mymodel.eval()

backend = "fbgemm"
"""Insert stubs"""
m = nn.Sequential(torch.quantization.QuantStub(), 
                  mymodel, 
                  torch.quantization.DeQuantStub())

"""Prepare"""
mymodel.qconfig = torch.quantization.get_default_qconfig(backend)
qmodel = torch.quantization.prepare(mymodel, inplace=False)

"""Calibrate
- This example uses random data for convenience. Use representative (validation) data instead.
"""
qmodel.eval()
with torch.inference_mode():
    for _ in range(10):
        x = torch.rand(1, 3, 512, 512)
        m(x)

"""Convert"""
torch.quantization.convert(qmodel)
torch.save(qmodel.state_dict(),'gulab_abhishek.pt')
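
For reference, here is a minimal sketch of how I understand the eager-mode prepare/calibrate/convert steps are meant to fit together (the names wrapped/prepared/quantized and the output filename are placeholders I made up, not from the repo): the qconfig goes on the stub-wrapped model, calibration runs through the prepared copy so its observers actually see data, and the return value of convert() is captured before saving. If the prepared-but-unconverted model gets saved instead, it still holds FP32 weights plus observer buffers, which would explain a file larger than the original.

import torch
from torch import nn

# minimal sketch, assuming `mymodel` is the FP32 GFPGANv1Clean instance built above
backend = "fbgemm"

wrapped = nn.Sequential(torch.quantization.QuantStub(),
                        mymodel,
                        torch.quantization.DeQuantStub())
wrapped.eval()

# attach the qconfig to the wrapped model (it propagates to the children)
wrapped.qconfig = torch.quantization.get_default_qconfig(backend)

# prepare() returns a copy with observers inserted; calibrate THAT copy
prepared = torch.quantization.prepare(wrapped, inplace=False)

with torch.inference_mode():
    for _ in range(10):
        prepared(torch.rand(1, 3, 512, 512))  # use representative images here

# convert() with inplace=False returns the quantized model; keep the return value
quantized = torch.quantization.convert(prepared, inplace=False)
torch.save(quantized.state_dict(), 'gfpgan_quantized.pt')  # placeholder filename

I am not sure every block in GFPGANv1Clean is supported by eager-mode static quantization, so this only illustrates the wiring of the steps, not a guaranteed working conversion of the whole network.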

In another approach I used deepcopy; the code I used is below:

gfpgan2 = copy.deepcopy(gfpgan)

#check model layer
from torchvision import models
from torchsummary import summary
print(dir(gfpgan2))
# backend = "fbgemm"
# """Insert stubs"""
# m = nn.Sequential(torch.quantization.QuantStub(), 
#                   gfpgan2, 
#                   torch.quantization.DeQuantStub())

# """Prepare"""
# gfpgan2.qconfig = torch.quantization.get_default_qconfig(backend)
# torch.quantization.prepare(gfpgan2, inplace=True)

# """Calibrate
# - This example uses random data for convenience. Use representative (validation) data instead.
# """
# with torch.inference_mode():
#   for _ in range(10):
#     x = torch.rand(1,3,512, 512)
#     m(x)
    
# """Convert"""
# torch.quantization.convert(gfpgan2, inplace=True)
torch.save(gfpgan2.state_dict(), 'quantized_modelv2.pth')

2- In the second method the quantized model size is nearly half of the original model size, but here I get another error when inferencing:
I get missing-keys errors, or I just can't load the model at all.

 File "C:\Users\Risin\Desktop\GFPGAN\inference_TDI.py", line 19, in <module>
    restorer = GFPGANer(
  File "C:\Users\Risin\Desktop\GFPGAN\gfpgan\utils.py", line 102, in __init__
    self.gfpgan.load_state_dict(torch.load("C:/Users/Risin/Desktop/GFPGAN/quantized_modelv2.pth")['params_ema'])
KeyError: 'params_ema'
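
If I read the traceback right, the checkpoint saved in the second approach is a bare state_dict, while the loader indexes it with 'params_ema'. A minimal sketch of one way to save it in the layout the loader expects (assuming gfpgan2 is the model being saved) would be:

# wrap the weights under the key the loader looks up ('params_ema' in the traceback)
torch.save({'params_ema': gfpgan2.state_dict()}, 'quantized_modelv2.pth')

Note that load_state_dict() can still report missing or unexpected keys afterwards if the architecture built by GFPGANer does not match the saved model, for example if the saved model was actually converted to quantized modules.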

Sorry for the late reply. Does saving and loading the state_dict for the GFPGAN model work without quantization?
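
One way to check that would be a plain round trip with no quantization involved, as a minimal sketch (the filename gfpgan_roundtrip.pth is just a placeholder, and mymodel is the GFPGANv1Clean instance from above):

# load the published weights, re-save them under 'params_ema', then reload them strictly
sd = torch.load('C:/Users/Risin/Desktop/GFPGAN/GFPGANv1.3.pth')['params_ema']
mymodel.load_state_dict(sd)
torch.save({'params_ema': mymodel.state_dict()}, 'gfpgan_roundtrip.pth')

reloaded = torch.load('gfpgan_roundtrip.pth')
mymodel.load_state_dict(reloaded['params_ema'], strict=True)  # should raise no missing-key errors

If this round trip works, the missing-keys problem is specific to the quantized checkpoint rather than to saving and loading itself.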