Glow with generic model builder

#1

There are plenty of examples for using the image-classifier builder, but I have not found any examples on using the generic model builder. In particular:

  1. What would be the input format to the generic builder?
  2. I would guess that I would have to write my own input handler. Where would I do that?
  3. Relating to 2, is there a high level class that I can subclass from to create my own builder?

Ultimately, is there a barebones example somewhere?

Thanks!

#2

Hate answering my own question, but here goes… (I still have more questions below, though):

From Glow’s tools/ModelRunner.cpp:

int main(int argc, char **argv) {
  PlaceholderBindings bindings;
  // The loader verifies/initializes command line parameters, and initializes
  // the ExecutionEngine and Function.
  Loader loader(argc, argv);

  // Create the model based on the input net, and get SaveNode for the output.
  std::unique_ptr<ProtobufLoader> LD;
  if (!loader.getCaffe2NetDescFilename().empty()) {
    LD.reset(new Caffe2ModelLoader(loader.getCaffe2NetDescFilename(),
                                   loader.getCaffe2NetWeightFilename(), {}, {},
                                   *loader.getFunction()));
  } else {
    LD.reset(new ONNXModelLoader(loader.getOnnxModelFilename(), {}, {},
                                 *loader.getFunction()));
  }
  Placeholder *output = EXIT_ON_ERR(LD->getSingleOutput());
  auto *outputT = bindings.allocate(output);

  // Compile the model, and perform quantization/emit a bundle/dump debug info
  // if requested from command line.
  loader.compile(bindings);

  // If in bundle mode, do not run inference.
  if (!emittingBundle()) {
    loader.runInference(bindings);

    llvm::outs() << "Model: " << loader.getFunction()->getName() << "\n";

    // Print out the result of output operator.
    outputT->getHandle().dump();

    // If profiling, generate and serialize the quantization infos now that we
    // have run inference to gather the profile.
    if (profilingGraph()) {
      loader.generateAndSerializeQuantizationInfos(bindings);
    }
  }

  return 0;
}

It looks like the generic model-runner does not accept input; compare it with tools/ImageClassifier.cpp. We could probably modify it to accept custom input, or build on top of model-runner to create custom builders. My question, however, is this:
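To make the idea concrete, here is a rough sketch of how ModelRunner might be modified to accept one custom input, following the pattern in tools/ImageClassifier.cpp (declare the input by name and type in the loader constructor, then feed a tensor via updateInputPlaceholders). This only builds inside the Glow source tree; the input name "data", the shape, and the exact accessor names may differ across Glow versions, so treat it as a sketch rather than working code:

```cpp
#include "Loader.h"

using namespace glow;

int main(int argc, char **argv) {
  PlaceholderBindings bindings;
  Loader loader(argc, argv);

  // Declare the graph input by name and type, as ImageClassifier does, so the
  // loader creates a Placeholder for it instead of expecting a Constant.
  // "data" and {1, 3, 224, 224} are hypothetical; use your model's own input.
  Type inputType(ElemKind::FloatTy, {1, 3, 224, 224});
  Caffe2ModelLoader LD(loader.getCaffe2NetDescFilename(),
                       loader.getCaffe2NetWeightFilename(), {"data"},
                       {&inputType}, *loader.getFunction());

  Placeholder *input = loader.getModule()->getPlaceholderByName("data");
  Placeholder *output = EXIT_ON_ERR(LD.getSingleOutput());
  Tensor *outputT = bindings.allocate(output);

  loader.compile(bindings);

  // Fill the input tensor with your own data and run inference.
  Tensor inputData(ElemKind::FloatTy, {1, 3, 224, 224});
  inputData.getHandle().clear(0.5);
  updateInputPlaceholders(bindings, {input}, {&inputData});
  loader.runInference(bindings);

  outputT->getHandle().dump();
  return 0;
}
```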

Since model-runner does not accept input, how does it perform quantization (if asked) and other optimizations?

In particular, the documentation talks about the backend “watching” network activations as data flows through it, making decisions about which optimizations to apply and how to correctly quantize the network.

(Jordan Fix) #3

Correct – ModelRunner does not accept input. It assumes a model with no inputs and a single output.

  1. I would guess that I would have to write my own input handler. Where would I do that?
  2. Relating to 2, is there a high level class that I can subclass from to create my own builder?

  Ultimately, is there a barebones example somewhere?

You’d need to create your own version that uses a specific number of inputs/outputs given your use case, as well as load the inputs yourself into the input tensors. I think the most barebones example we have is ModelRunner. You could then look at ImageClassifier to see how we load inputs and outputs. TextTranslator also exhibits multiple inputs and outputs for a model. In each of those tools, look at the calls that create the Caffe2ModelLoader/ONNXModelLoader to see how multiple inputs/outputs are declared and how they’re used afterward.
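The multiple-input/output pattern from TextTranslator boils down to passing several names and types to the loader constructor and then fetching each output Placeholder by name. A sketch (all names and shapes here are hypothetical, not from any actual model):

```cpp
// Declare two named inputs with their types when constructing the loader.
Type tokensTy(ElemKind::Int64ITy, {1, 10});
Type lengthsTy(ElemKind::Int64ITy, {1});
Caffe2ModelLoader LD(netDescFilename, netWeightFilename,
                     {"tokens", "lengths"}, {&tokensTy, &lengthsTy}, *F);

// Retrieve each output Placeholder by the name it has in the graph,
// instead of getSingleOutput().
Placeholder *scores = EXIT_ON_ERR(LD.getOutputByName("scores"));
Placeholder *indices = EXIT_ON_ERR(LD.getOutputByName("indices"));
Tensor *scoresT = bindings.allocate(scores);
Tensor *indicesT = bindings.allocate(indices);
```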

Since model-runner does not accept input, how does it perform quantization (if asked) and other optimizations?

Good question – it would only be able to quantize given the Constant inputs already specified. However, the ModelRunner is really there as a simple tester for operator support, and is not used for serious models where you would want to gather a profile across many different inputs and then test accuracy across some test set.
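Putting the two answers together, a profiling driver for a serious model might look roughly like the following. loadNextBatch is a hypothetical helper that fills the input tensor from your dataset; the compile/run/serialize calls mirror the ModelRunner flow quoted above, with the graph instrumented for profiling via Glow's -dump-profile flag:

```cpp
// Gather a quantization profile across many inputs, then serialize it.
loader.compile(bindings); // with -dump-profile, instruments the graph

Tensor inputData(ElemKind::FloatTy, {1, 3, 224, 224});
for (size_t i = 0; i < numBatches; ++i) {
  loadNextBatch(&inputData, i); // hypothetical: fill from your dataset
  updateInputPlaceholders(bindings, {input}, {&inputData});
  loader.runInference(bindings);
}

// Write out the gathered profile for a later quantized compile
// with -load-profile.
if (profilingGraph()) {
  loader.generateAndSerializeQuantizationInfos(bindings);
}
```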
