Sorry if this question has already been answered somewhere; I couldn’t find a similar one on the forum, so I’m posting it here and hoping for your answers.
So we have a simple trained model and applied static quantization to obtain a quantized model, using 'fbgemm' as the qconfig:
myModel.qconfig = torch.quantization.get_default_qconfig('fbgemm')
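For reference, the full eager-mode flow I’m following looks roughly like this (a sketch only; `MyModel` and the calibration tensor below are placeholders, not my actual model):

```python
import torch
import torch.nn as nn

# Sketch of eager-mode static quantization (MyModel and the calibration
# input are placeholders, not the original model from this post).
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # float -> quint8
        self.featExt = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=5, stride=5, bias=False),
            nn.ReLU())
        self.dequant = torch.quantization.DeQuantStub()  # quint8 -> float

    def forward(self, x):
        return self.dequant(self.featExt(self.quant(x)))

myModel = MyModel().eval()
myModel.qconfig = torch.quantization.get_default_qconfig('fbgemm')
torch.quantization.prepare(myModel, inplace=True)
myModel(torch.randn(4, 1, 20, 20))   # calibration pass to collect ranges
torch.quantization.convert(myModel, inplace=True)
print(type(myModel.featExt[0]))      # quantized Conv2d after convert
```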
After this, we have the quantized model, and we export its weights via int_repr().
I expected that if I create a similar architecture and load the int-represented weights into it, I could reproduce the quantized model’s per-layer results, but it turns out the results are different.
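For context, my understanding (which may be where I’m going wrong) is that int_repr() alone does not carry the full value: each quantized tensor also stores a scale and zero point, and the real value is recovered as scale * (int - zero_point). A minimal check of that relation:

```python
import torch

# Affine (de)quantization relation assumed here: real ~= scale * (q - zero_point)
x = torch.randn(3, 3)
xq = torch.quantize_per_tensor(x, scale=0.1, zero_point=128, dtype=torch.quint8)
manual = (xq.int_repr().float() - xq.q_zero_point()) * xq.q_scale()
print(torch.allclose(manual, xq.dequantize()))  # both reconstructions agree
```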
Below are the detailed flows:
Notes: x_batch and x_quant were previously exported to a pickle file (from the quantized model in eval mode) and are reloaded here for comparison.
```python
# Flow 1: using x as input, compute results through the loaded quantized model
# forward: x --> x_quant = self.quant(x) --> f = self.featExt(x_quant)
# featExt definition:
#   self.featExt = nn.Sequential(
#       nn.Conv2d(1, 8, kernel_size=5, stride=5, bias=False),
#       nn.ReLU())
x_quant_new, f, x_conv, y_hat = quant_net.forward(x_batch)
print('using saved quantized model: ')
print('x_quant to compare(int): ', x_quant_new.int_repr())
print('filter to compare(int): ', quant_net.featExt.weight().int_repr())
print('output to compare(int): ', f.int_repr())
```
```python
# Flow 2: using x_quant as input, compute the conv2d with a plain PyTorch layer
conv2d = nn.Conv2d(1, 8, kernel_size=5, stride=5, bias=False)
conv2d.weight.data = my_debug_net.featConv.weight.data
with torch.no_grad():
    conv2d.eval()
    res1 = conv2d(x_quant.type(torch.CharTensor))
print('*********using F.conv2d***********')
print('x_quant: ', x_quant)
print('filter: ', conv2d.weight.data)
print('F.conv2d Output ', res1)
print('F.relu Output ', F.relu(res1))
```
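If it helps to clarify what I’m after: the comparison I would expect to hold is in the float domain rather than on the raw integers. Something like the following sketch (placeholder tensors and quantization parameters, not my actual weights), where a quantized conv and a float conv over the dequantized operands should agree up to output rounding:

```python
import torch
import torch.nn.functional as F

torch.backends.quantized.engine = 'fbgemm'

# Placeholder input/weight, quantized per-tensor (not the model from this post).
x = torch.randn(1, 1, 10, 10)
w = torch.randn(8, 1, 5, 5)
xq = torch.quantize_per_tensor(x, scale=0.05, zero_point=64, dtype=torch.quint8)
wq = torch.quantize_per_tensor(w, scale=0.02, zero_point=0, dtype=torch.qint8)

qconv = torch.nn.quantized.Conv2d(1, 8, kernel_size=5, stride=5, bias=False)
qconv.set_weight_bias(wq, None)
qconv.scale, qconv.zero_point = 0.2, 128   # output requantization parameters

out = qconv(xq)                                             # int8 kernel under the hood
ref = F.conv2d(xq.dequantize(), wq.dequantize(), stride=5)  # float reference
print((out.dequantize() - ref).abs().max())  # small: only output rounding error
```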