Hi,
I am interested in implementing layer-wise quantization in Glow.
I have looked through the issue on this topic ([Quantization] layer-wise quantization · Issue #4982 · pytorch/glow · GitHub), but it seems to have been left untouched.
My question is: given a list of nodes of type glow::Node (or a list of node names, e.g. Conv_conv_1__2) in a model, where each node has its own quantization options (for example, enableChannelwiseOpt, quantizationCalibrationOpt, keepOriginalPrecisionForNodesOpt, …), how do we incorporate this information into the code? Below I give a little more background.
Whether profiling or compiling a model, the Glow model loader takes roughly three steps:
1. load the model (loader.loadModel())
2. get the compilation settings (loader.getCompilationContext(QuantizationMode))
3. compile the model using the CompilationContext config (loader.compile(cctx))
For simplicity, I think the first thing to do is to load the quantization options and attach them to all of the nodes in a model, in step 2. To hold this information, we can add a map to precConfig, which is of type PrecisionConfiguration. For example, the map’s key is the node name, and the value is itself a map whose keys are quantization options and whose values are the actual preferences. This is done by modifying the Loader::getCompilationContext function in Loader.cpp.
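To make the nested map concrete, here is a minimal standalone sketch of the container I have in mind. It does not use any Glow headers; the struct name PerNodeQuantOptions, its members, and the option key strings are all hypothetical and would need to be adapted to whatever PrecisionConfiguration actually looks like.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical container, NOT part of Glow's PrecisionConfiguration.
// Outer key: node name. Inner key: option name. Inner value: preference.
struct PerNodeQuantOptions {
  std::map<std::string, std::map<std::string, std::string>> options;

  // Record one option for one node, overwriting any earlier value.
  void set(const std::string &nodeName, const std::string &optionName,
           const std::string &value) {
    assert(!nodeName.empty() && "node name must not be empty");
    options[nodeName][optionName] = value;
  }

  // Return a pointer to the stored value, or nullptr if either the node
  // or the option has no entry.
  const std::string *find(const std::string &nodeName,
                          const std::string &optionName) const {
    auto nodeIt = options.find(nodeName);
    if (nodeIt == options.end()) {
      return nullptr;
    }
    auto optIt = nodeIt->second.find(optionName);
    if (optIt == nodeIt->second.end()) {
      return nullptr;
    }
    return &optIt->second;
  }
};
```

Storing the preferences as strings keeps the sketch generic; a typed struct per node would probably be safer in real code, at the cost of touching more places when an option is added.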
Then, modify the transformForPrecisionMode function in GraphOptimizer.cpp, as mentioned in the issue linked above. This function is called in optimizeFunction in the same file, which (I suppose) is called in the Loader::compile function in step 3.
In transformForPrecisionMode, if we are profiling, the profileQuantization function in GraphOptimizer/Quantization.cpp is called. If we are quantizing instead, the quantizeFunction function in quantization/Quantization.cpp is called.
Either way, I am not sure exactly where to connect the layer-wise config above to these functions. I don’t see anything to modify in the former case (profiling), because the inserted profiling nodes themselves do not depend on configs such as the calibration method and precision selection, which only matter when executing quantization. In the latter case, however, since quantizeFunction creates a FunctionQuantizer instance internally, I expect some member functions of that class need to be modified. Although I looked at the class, it’s a little complex and I couldn’t find where to implement this.
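To illustrate the kind of hook I am imagining (wherever the quantizer decides how to treat a node, it would look the node up by name and fall back to the global settings), here is a rough standalone sketch. It is not based on the real FunctionQuantizer; NodeQuantSettings, LayerwiseQuantConfig, resolve, and the calibration strings are hypothetical placeholders.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified stand-ins for the per-node options discussed
// above. These are NOT Glow types.
struct NodeQuantSettings {
  bool enableChannelwise = false;
  bool keepOriginalPrecision = false;
  std::string calibration = "none"; // placeholder value, not a Glow enum
};

// Hypothetical layer-wise config: global defaults plus per-node overrides
// keyed by node name.
struct LayerwiseQuantConfig {
  NodeQuantSettings defaults;
  std::map<std::string, NodeQuantSettings> overrides;

  // Return the override for nodeName if one exists, else the defaults.
  const NodeQuantSettings &resolve(const std::string &nodeName) const {
    auto it = overrides.find(nodeName);
    return it == overrides.end() ? defaults : it->second;
  }
};
```

With something like this, nodes without an entry would behave exactly as they do today, so the change would only affect nodes that are explicitly listed.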
Also, the glow::lower function in Lower.cpp needs some extension, since the KindSet doNotLowerKinds is used there. What do you think?
To sum up, it is not clear to me which functions called in or after transformForPrecisionMode should be modified, and how. Additionally, if any functions other than transformForPrecisionMode need to be changed, I would like to know what they are.
Any hints or thoughts?