Looking at the model definition you posted, it looks like it is not yet quantized. One missing step is calibration. You can add a calibration step after you call prepare and before you call convert:
torch.quantization.prepare(fused_model, inplace=True)

# calibrate the model by running representative example inputs through it
for inputs in your_dataset:
    fused_model(inputs)

quantized = torch.quantization.convert(fused_model, inplace=False)
print('Quantized model size:')
print_size_of_model(quantized)
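Note that prepare only inserts observers if a qconfig has been set on the model beforehand. In the eager-mode static quantization tutorial this is done before the prepare call, for example:

# set the qconfig before calling prepare; 'fbgemm' targets x86 servers,
# use 'qnnpack' instead on ARM/mobile
fused_model.qconfig = torch.quantization.get_default_qconfig('fbgemm')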
In the paste here (https://github.com/raghavgurbaxani/Quantization_Experiments/blob/master/quantized_model.txt), the model doesn't look quantized: one would expect to see QuantizedConv instead of Conv and QuantizedLinear instead of Linear. Make sure you actually run the convert step, and check that the quantized module equivalents show up afterwards.
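One quick way to check, assuming quantized is the model returned by convert, is to print the module types and look for the quantized equivalents:

# a minimal sketch, assuming `quantized` is the output of convert()
for name, module in quantized.named_modules():
    print(name, type(module))
# after a successful convert, Conv2d/Linear layers should appear as their
# quantized counterparts (e.g. torch.nn.quantized.Conv2d / Linear)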
I suspect my QuantStub and DeQuantStub placement may be incorrect, but apart from that I've followed all the steps from the static quantization tutorial.
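For reference, the static quantization tutorial wraps the model boundary with the stubs roughly like this (a minimal sketch; the conv/relu layers here are placeholders, not your actual model):

import torch
import torch.nn as nn

class QuantReadyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # fp32 -> int8 at the input
        self.conv = nn.Conv2d(3, 16, 3)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # int8 -> fp32 at the output

    def forward(self, x):
        x = self.quant(x)            # quantize incoming activations
        x = self.relu(self.conv(x))
        return self.dequant(x)       # dequantize before returning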
Hey @Vasiliy_Kuznetsov! I am also experiencing a similar error, but only when quantising torch.nn.GRU with this script:
import torch
import torch.nn as nn
from torch.ao.quantization.qconfig_mapping import QConfigMapping
import torch.quantization.quantize_fx as quantize_fx
import copy

class UserModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.l = nn.GRU(128, 128, 128, batch_first=True, bidirectional=True)

    def forward(self, x):
        return self.l(x)

model_fp = UserModule()

model_to_quantize = copy.deepcopy(model_fp)
model_to_quantize.eval()
qconfig_mapping = QConfigMapping().set_global(torch.quantization.default_dynamic_qconfig)

# a tuple of one or more example inputs is needed to trace the model
model_prepared = quantize_fx.prepare_fx(model_to_quantize, qconfig_mapping, None)
model_quantized = quantize_fx.convert_fx(model_prepared)

def print_size_of_model(model):
    import os
    torch.save(model.state_dict(), "temp.p")
    print('Size (MB):', os.path.getsize("temp.p") / 1e6)
    os.remove('temp.p')
print_size_of_model(model_fp)
print_size_of_model(model_quantized)
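As a point of comparison (not necessarily a fix for the FX graph mode error), eager-mode dynamic quantization supports nn.GRU directly; a minimal sketch:

# a minimal sketch using eager-mode dynamic quantization, which has
# built-in support for nn.GRU; `model_fp` is the float model from above
model_dyn = torch.quantization.quantize_dynamic(
    model_fp, {nn.GRU}, dtype=torch.qint8
)
print_size_of_model(model_dyn)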