Hi, I am working with a quantized model in C++. I trained and quantized the model in Python (post-training quantization) and loaded it into C++. Is there a way to parse the jitted model's parameters (TorchScript format) in C++? I could not find any layer-unpacking facilities in torch::jit::script::Module.
After loading the model, I can dump the script module's submodules, attributes, and method graphs with torch::jit::script::Module::dump(). The loading code is roughly the following (a minimal sketch; the model path is a placeholder):
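#include <torch/script.h>
#include <iostream>

int main() {
  // Load the TorchScript archive exported from Python.
  torch::jit::script::Module m = torch::jit::load("mnist_quantized.pt");
  // Print method bodies, attribute values, and parameter values.
  m.dump(/*print_method_bodies=*/true,
         /*print_attr_values=*/true,
         /*print_param_values=*/true);
  return 0;
}

which prints: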
--------------------------------------------------------------------------------------------------------------------
dumping module module __torch__.Net {
parameters {
}
attributes {
training = False
(Here->) fc1 = <__torch__.torch.nn.intrinsic.quantized.modules.linear_relu.LinearReLU object at 0x5555566b75f0>
Relu1 = <__torch__.torch.nn.modules.linear.Identity object at 0x5555566b3d40>
fc2 = <__torch__.torch.nn.intrinsic.quantized.modules.linear_relu.LinearReLU object at 0x5555566a7030>
Relu2 = <__torch__.torch.nn.modules.linear.Identity object at 0x5555567b1440>
droput2 = <__torch__.torch.nn.modules.dropout.Dropout object at 0x5555567b1640>
fc3 = <__torch__.torch.nn.quantized.modules.linear.Linear object at 0x5555567b22c0>
quant = <__torch__.torch.nn.quantized.modules.Quantize object at 0x5555567b2ad0>
dequant = <__torch__.torch.nn.quantized.modules.DeQuantize object at 0x5555567b8350>
logMax = <__torch__.torch.nn.modules.activation.LogSoftmax object at 0x5555567b87c0>
}
methods {
method forward {
graph(%self.1 : __torch__.Net,
%x.1 : Tensor):
%7 : int = prim::Constant[value=-1]() # ~//MNIST_PyTorch_Quantize.py:50:19
%8 : int = prim::Constant[value=784]() # ~//MNIST_PyTorch_Quantize.py:50:23
%3 : __torch__.torch.nn.quantized.modules.Quantize = prim::GetAttr[name="quant"](%self.1)
%x0.1 : Tensor = prim::CallMethod[name="forward"](%3, %x.1) # :0:0
%9 : int[] = prim::ListConstruct(%7, %8)
%x1.1 : Tensor = aten::view(%x0.1, %9) # ~//MNIST_PyTorch_Quantize.py:50:12
%12 : __torch__.torch.nn.intrinsic.quantized.modules.linear_relu.LinearReLU = prim::GetAttr[name="fc1"](%self.1)
%x2.1 : Tensor = prim::CallMethod[name="forward"](%12, %x1.1) # :0:0
%16 : __torch__.torch.nn.modules.linear.Identity = prim::GetAttr[name="Relu1"](%self.1)
%x3.1 : Tensor = prim::CallMethod[name="forward"](%16, %x2.1) # :0:0
%20 : __torch__.torch.nn.intrinsic.quantized.modules.linear_relu.LinearReLU = prim::GetAttr[name="fc2"](%self.1)
%x4.1 : Tensor = prim::CallMethod[name="forward"](%20, %x3.1) # :0:0
%24 : __torch__.torch.nn.modules.linear.Identity = prim::GetAttr[name="Relu2"](%self.1)
%x5.1 : Tensor = prim::CallMethod[name="forward"](%24, %x4.1) # :0:0
%28 : __torch__.torch.nn.modules.dropout.Dropout = prim::GetAttr[name="droput2"](%self.1)
%x6.1 : Tensor = prim::CallMethod[name="forward"](%28, %x5.1) # :0:0
%32 : __torch__.torch.nn.quantized.modules.linear.Linear = prim::GetAttr[name="fc3"](%self.1)
%x7.1 : Tensor = prim::CallMethod[name="forward"](%32, %x6.1) # :0:0
%36 : __torch__.torch.nn.quantized.modules.DeQuantize = prim::GetAttr[name="dequant"](%self.1)
%x8.1 : Tensor = prim::CallMethod[name="forward"](%36, %x7.1) # :0:0
%40 : __torch__.torch.nn.modules.activation.LogSoftmax = prim::GetAttr[name="logMax"](%self.1)
%42 : Tensor = prim::CallMethod[name="forward"](%40, %x8.1) # :0:0
return (%42)
}
}
submodules {
module __torch__.torch.nn.intrinsic.quantized.modules.linear_relu.LinearReLU {
parameters {
}
attributes {
training = False
in_features = 784
out_features = 512
scale = 0.048926487565040588
zero_point = 0
_packed_params = <__torch__.torch.nn.quantized.modules.linear.LinearPackedParams object at 0x5555566bcc40>
}
methods {
method forward {
graph(%self.1 : __torch__.torch.nn.intrinsic.quantized.modules.linear_relu.LinearReLU,
%input.1 : Tensor):
%4 : __torch__.torch.nn.quantized.modules.linear.LinearPackedParams = prim::GetAttr[name="_packed_params"](%self.1)
%5 : Tensor = prim::GetAttr[name="_packed_params"](%4)
%7 : float = prim::GetAttr[name="scale"](%self.1)
%9 : int = prim::GetAttr[name="zero_point"](%self.1)
%Y_q.1 : Tensor = quantized::linear_relu(%input.1, %5, %7, %9) # ~/python3.7/site-packages/torch/nn/intrinsic/quantized/modules/linear_relu.py:29:14
return (%Y_q.1)
}
}
submodules {
(+) module __torch__.torch.nn.quantized.modules.linear.LinearPackedParams {
}
}
module __torch__.torch.nn.modules.linear.Identity {} (+)
module __torch__.torch.nn.intrinsic.quantized.modules.linear_relu.LinearReLU {} (+)
module __torch__.torch.nn.modules.linear.Identity {} (+)
module __torch__.torch.nn.modules.dropout.Dropout {} (+)
module __torch__.torch.nn.quantized.modules.linear.Linear {} (+)
module __torch__.torch.nn.quantized.modules.Quantize {} (+)
module __torch__.torch.nn.quantized.modules.DeQuantize {} (+)
module __torch__.torch.nn.modules.activation.LogSoftmax {} (+)
} // end of submodules
} // end of dumping module module __torch__.Net
--------------------------------------------------------------------------------------------------------
Notes: (+) marks places where lines were collapsed to save space.
Torch version 1.6.0+.
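For context, I can already walk the submodule tree and read the plain attributes that appear in the dump (a sketch, continuing from the m loaded above):

// Walk all submodules and print the quantization attributes from the dump.
for (const torch::jit::NameModule& nm : m.named_modules()) {
  const torch::jit::Module& sub = nm.value;
  if (sub.hasattr("scale") && sub.hasattr("zero_point")) {
    std::cout << nm.name
              << "  scale=" << sub.attr("scale").toDouble()
              << "  zero_point=" << sub.attr("zero_point").toInt()
              << std::endl;
  }
}

What I cannot see is how to get from _packed_params to the actual weight tensors, which leads to my questions.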
My Questions:
1- During model load, are the module layers repacked into types like torch::nn::Linear, torch::nn::Conv1d, etc.? How can I access them?
2- Do the pointers printed in the dump (e.g., the line marked with "(Here->)") refer to Python objects and methods, or to C++ objects and methods? Are they castable to the torch::nn::* types?
3- What is the recommended procedure for reconstructing the model from the jitted/TorchScript format, i.e., recovering the number of layers, each layer's attributes/configuration, and the corresponding trained weights? (A sketch of what I have tried so far is below.)
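For question 3, the closest I have gotten is calling _weight_bias() on the LinearPackedParams submodule. This works on my build because torch/nn/quantized/modules/linear.py marks that method with @torch.jit.export, but it looks like an internal detail, so I am not sure it is the intended API (again continuing from m):

// Attempt: unpack fc1's quantized weight and bias via the
// LinearPackedParams submodule's exported _weight_bias() method.
torch::jit::Module fc1 = m.attr("fc1").toModule();
torch::jit::Module packed = fc1.attr("_packed_params").toModule();
// _weight_bias() returns (weight, bias); the weight is a quantized tensor.
auto wb = packed.get_method("_weight_bias")(std::vector<torch::jit::IValue>{})
              .toTuple();
at::Tensor w_q = wb->elements()[0].toTensor();  // quantized (qint8) weight
torch::jit::IValue bias = wb->elements()[1];    // Optional[Tensor]
std::cout << "fc1 weight sizes: " << w_q.sizes()
          << "  dtype: " << w_q.dtype() << std::endl;
// w_q.int_repr() gives the raw int8 values, w_q.dequantize() the floats.
at::Tensor w_float = w_q.dequantize();

Even if this is acceptable, it does not tell me how to rebuild the layers as torch::nn::* modules, hence questions 1 and 2.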