The packing format of quantized parameters after jitting

Hi, following the static quantization tutorial, I am trying to extract the parameters of a quantized and jitted model. It seems that after jitting, the parameters are packed in a way that I don’t understand. For example, if I run the snippet below after the tutorial script, I get the output below.

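import numpy as np
import torch

# per_channel_quantized_model is the quantized model built by the tutorial script
input_size = (1, 3, 224, 224)
inp = np.random.randn(*input_size).astype("float32")
trace = torch.jit.trace(per_channel_quantized_model, torch.from_numpy(inp))
state_dict = trace.state_dict()
# Print the name and shape of every entry in the traced model's state dict
for k, v in state_dict.items():
    print(k, v.size())
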
features.0.0._packed_params torch.Size([128])
features.1.conv.0.0._packed_params torch.Size([128])
features.1.conv.1._packed_params torch.Size([128])
features.2.conv.0.0._packed_params torch.Size([128])
features.2.conv.1.0._packed_params torch.Size([128])
features.2.conv.2._packed_params torch.Size([128])
features.3.conv.0.0._packed_params torch.Size([128])
features.3.conv.1.0._packed_params torch.Size([128])
features.3.conv.2._packed_params torch.Size([128])
features.4.conv.0.0._packed_params torch.Size([128])
features.4.conv.1.0._packed_params torch.Size([128])
features.4.conv.2._packed_params torch.Size([128])
features.5.conv.0.0._packed_params torch.Size([128])
features.5.conv.1.0._packed_params torch.Size([128])
features.5.conv.2._packed_params torch.Size([128])
features.6.conv.0.0._packed_params torch.Size([128])
features.6.conv.1.0._packed_params torch.Size([128])
features.6.conv.2._packed_params torch.Size([128])
features.7.conv.0.0._packed_params torch.Size([128])
features.7.conv.1.0._packed_params torch.Size([128])
features.7.conv.2._packed_params torch.Size([128])
features.8.conv.0.0._packed_params torch.Size([128])
features.8.conv.1.0._packed_params torch.Size([128])
features.8.conv.2._packed_params torch.Size([128])
features.9.conv.0.0._packed_params torch.Size([128])
features.9.conv.1.0._packed_params torch.Size([128])
features.9.conv.2._packed_params torch.Size([128])
features.10.conv.0.0._packed_params torch.Size([128])
features.10.conv.1.0._packed_params torch.Size([128])
features.10.conv.2._packed_params torch.Size([128])
features.11.conv.0.0._packed_params torch.Size([128])
features.11.conv.1.0._packed_params torch.Size([128])
features.11.conv.2._packed_params torch.Size([128])
features.12.conv.0.0._packed_params torch.Size([128])
features.12.conv.1.0._packed_params torch.Size([128])
features.12.conv.2._packed_params torch.Size([128])
features.13.conv.0.0._packed_params torch.Size([128])
features.13.conv.1.0._packed_params torch.Size([128])
features.13.conv.2._packed_params torch.Size([128])
features.14.conv.0.0._packed_params torch.Size([128])
features.14.conv.1.0._packed_params torch.Size([128])
features.14.conv.2._packed_params torch.Size([128])
features.15.conv.0.0._packed_params torch.Size([128])
features.15.conv.1.0._packed_params torch.Size([128])
features.15.conv.2._packed_params torch.Size([128])
features.16.conv.0.0._packed_params torch.Size([128])
features.16.conv.1.0._packed_params torch.Size([128])
features.16.conv.2._packed_params torch.Size([128])
features.17.conv.0.0._packed_params torch.Size([128])
features.17.conv.1.0._packed_params torch.Size([128])
features.17.conv.2._packed_params torch.Size([128])
features.18.0._packed_params torch.Size([128])
quant.scale torch.Size([1])
quant.zero_point torch.Size([1])
classifier.1._packed_params._packed_params torch.Size([104])

I have no idea what is going on in this format and I have many questions. But for now let me ask you these:

  • Is there documentation of the packing format?
  • How can I extract the original floating point tensors along with scale and zero point? I confirmed that they are available before tracing.
  • Or even better, is there a way to prevent packing?
  • During tracing, where in the code base does this packing happen?

I’m trying to translate a jitted, quantized PyTorch model to TVM IR. For that I need the floating point tensors along with their scale and zero point. That is the reason I’m asking here.

cc @raghuramank100 @jerryzh168

OK, torch.ops.quantized.conv2d_unpack did the job.
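
For anyone landing here later, this is a minimal sketch of how conv2d_unpack can be used. It does not run on the tutorial model: the weight, bias and conv settings are made-up values, and the packed object is created by hand with torch.ops.quantized.conv2d_prepack so the snippet is self-contained. It assumes a PyTorch build with a quantized backend (fbgemm or qnnpack). Note that in recent PyTorch releases the packed parameters may show up as an opaque object rather than the byte tensor printed above, so treat this as an illustration and not as the exact steps used in this thread.

```python
import torch

# Hypothetical conv weight: 4 output channels, 3 input channels, 3x3 kernel.
w_float = torch.randn(4, 3, 3, 3)
bias = torch.randn(4)

# Per-channel (axis 0) symmetric quantization to qint8.
scales = w_float.abs().amax(dim=(1, 2, 3)) / 127.0
zero_points = torch.zeros(4, dtype=torch.int64)
w_q = torch.quantize_per_channel(w_float, scales, zero_points, axis=0, dtype=torch.qint8)

# Pack by hand (stride, padding, dilation, groups), standing in for what a
# quantized conv module holds in _packed_params.
packed = torch.ops.quantized.conv2d_prepack(w_q, bias, [1, 1], [0, 0], [1, 1], 1)

# Unpack: returns the quantized weight tensor and the float bias.
w_unpacked, b_unpacked = torch.ops.quantized.conv2d_unpack(packed)

w_int8 = w_unpacked.int_repr()                   # raw int8 values
w_scales = w_unpacked.q_per_channel_scales()     # one scale per output channel
w_zps = w_unpacked.q_per_channel_zero_points()   # one zero point per output channel
w_deq = w_unpacked.dequantize()                  # float approximation of w_float

print(w_int8.shape, w_scales, w_zps)
```

The dequantized tensor is only an approximation of the original float weight, within the quantization error of each channel.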

Hello, I ran into the same problem. Could you show in detail how to use “torch.ops.quantized.conv2d_unpack”? And how should I deal with classifier.1._packed_params?
Thanks!

See the implementation in TVM I added:

From the name classifier.1._packed_params I guess it comes from nn.Linear. In that case, you need to use torch.ops.quantized.linear_unpack.
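
A rough sketch of the nn.Linear case, under the same assumptions as any standalone example: the weight and bias below are made up, the packed object is built by hand with torch.ops.quantized.linear_prepack, and a quantized backend (fbgemm or qnnpack) has to be available. With a real model you would pass the module's packed parameters to linear_unpack instead.

```python
import torch

# Hypothetical linear layer: 8 input features, 5 output features.
w_float = torch.randn(5, 8)
bias = torch.randn(5)

# Per-tensor symmetric quantization to qint8.
scale = float(w_float.abs().max()) / 127.0
w_q = torch.quantize_per_tensor(w_float, scale, 0, torch.qint8)

# Pack by hand, standing in for the linear module's packed parameters.
packed = torch.ops.quantized.linear_prepack(w_q, bias)

# Unpack: returns the quantized weight tensor and the float bias.
w_unpacked, b_unpacked = torch.ops.quantized.linear_unpack(packed)

print(w_unpacked.q_scale(), w_unpacked.q_zero_point())
print(w_unpacked.dequantize().shape, b_unpacked.shape)
```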
