Apologies if this question has already been answered somewhere; I couldn't find a similar one on the forum, so I'm posting it here in the hope of an answer.
We have a simple trained model and applied static quantization with 'fbgemm' as the qconfig to obtain a quantized model:
myModel.qconfig = torch.quantization.get_default_qconfig('fbgemm')
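For context, the overall flow follows the usual prepare/calibrate/convert recipe. A minimal, self-contained sketch of that flow (the model and calibration data below are placeholders, not my actual ones):

```python
import torch
import torch.nn as nn

# Toy stand-in for the real model (placeholder architecture).
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.featExt = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=5, stride=5, bias=False),
            nn.ReLU())
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.featExt(self.quant(x)))

myModel = TinyNet().eval()
myModel.qconfig = torch.quantization.get_default_qconfig('fbgemm')
torch.quantization.prepare(myModel, inplace=True)   # insert observers
myModel(torch.randn(4, 1, 20, 20))                  # calibration pass
torch.quantization.convert(myModel, inplace=True)   # swap in quantized ops
print(myModel.featExt[0].weight().int_repr().dtype) # torch.int8
```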
After this, we have the quantized model with its weights exported (via int_repr()).
I expected that if I created a similar architecture and loaded the int-represented weights into it, I could reproduce the quantized model's per-layer results, but it turns out the results are different.
Below are the detailed flows:
# Note: x_batch and x_quant were previously exported to a pickle file during
# an eval run of the quantized model, and are reloaded here for comparison

# Flow 1
# Using x as input, compute the results through the loaded quantized model.
# forward: x --> x_quant = self.quant(x) --> f = self.featExt(x_quant)
# featExt definition: self.featExt = nn.Sequential(nn.Conv2d(1, 8,
#     kernel_size=5, stride=5, bias=False), nn.ReLU())
x_quant_new, f, x_conv, y_hat = quant_net(x_batch[0])
print('using saved quantized model: ')
print('x_quant to compare(int): ', x_quant_new.int_repr())
print('filter to compare(int): ', quant_net.featExt[0].weight().int_repr())
print('output to compare(int): ', f.int_repr())
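One thing I noticed while printing these: int_repr() only gives the raw integers, while a quantized tensor also carries a scale and zero_point. A self-contained toy illustration (values made up, not from my model):

```python
import torch

# Two quantized tensors with identical int_repr() can represent
# different real values if their scales differ.
x = torch.tensor([0.0, 0.5, 1.0])
q1 = torch.quantize_per_tensor(x, scale=0.1, zero_point=0,
                               dtype=torch.quint8)
q2 = torch.quantize_per_tensor(x * 2, scale=0.2, zero_point=0,
                               dtype=torch.quint8)
print(q1.int_repr())    # tensor([ 0,  5, 10], dtype=torch.uint8)
print(q2.int_repr())    # same integers as q1
print(q1.dequantize())  # tensor([0.0000, 0.5000, 1.0000])
print(q2.dequantize())  # tensor([0., 1., 2.]) -- different real values
```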
#Flow 2
# Using x_quant as input, compute the conv2d with a plain PyTorch nn.Conv2d
conv2d = nn.Conv2d(1, 8, kernel_size=5, stride=5, bias=False)
conv2d.weight.data = my_debug_net.featConv.weight.data
with torch.no_grad():
    conv2d.eval()
    res1 = conv2d(x_quant[0].type(torch.CharTensor))
print('*********using nn.Conv2d***********')
print('x_quant: ', x_quant[0])
print('filter: ', conv2d.weight.data)
print('nn.Conv2d Output ', res1)
print('F.relu Output ', F.relu(res1))
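For reference, here is a minimal toy reproduction of the mismatch that I can share (placeholder model and random data, not my actual net). In this toy version, dequantizing the input and weight before the float conv and then requantizing with the layer's output scale/zero_point appears to track the quantized layer's output, while feeding the raw int values does not. This reflects my current understanding rather than a confirmed answer:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy model standing in for the real one (placeholder).
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(1, 8, kernel_size=5, stride=5, bias=False)

    def forward(self, x):
        xq = self.quant(x)
        return xq, self.conv(xq)

net = Net().eval()
net.qconfig = torch.quantization.get_default_qconfig('fbgemm')
torch.quantization.prepare(net, inplace=True)
net(torch.randn(1, 1, 20, 20))                   # calibration
torch.quantization.convert(net, inplace=True)

xq, f = net(torch.randn(1, 1, 20, 20))           # both outputs quantized

# Reference path: dequantize input and weight, run a float conv,
# then requantize with the layer's output scale and zero_point.
ref = F.conv2d(xq.dequantize(), net.conv.weight().dequantize(), stride=5)
ref_q = torch.quantize_per_tensor(ref, net.conv.scale,
                                  net.conv.zero_point, torch.quint8)
diff = (ref_q.int_repr().int() - f.int_repr().int()).abs().max()
print(diff)  # small -- only rounding differences remain
```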