No tensor registered with name

#1

From model-runner I get the error ProtobufLoader.cpp line: 33 message: There is no tensor registered with name input_data. My graph (ONNX) looks like this:

graph(%input_data : Float(1, 2)
      %1 : Float(8, 2)
      %2 : Float(8)
      %3 : Float(8, 8)
      %4 : Float(8)
      %5 : Float(8, 8)
      %6 : Float(8)
      %7 : Float(1, 8)
      %8 : Float(1)) {
  %9 : Float(1, 8) = onnx::Gemm[alpha=1, beta=1, transB=1](%input_data, %1, %2), scope: FeedForwardNN/Linear
  %10 : Float(1, 8) = onnx::Relu(%9), scope: FeedForwardNN/ReLU
  %11 : Float(1, 8) = onnx::Gemm[alpha=1, beta=1, transB=1](%10, %3, %4), scope: FeedForwardNN/ReLU
  %12 : Float(1, 8) = onnx::Relu(%11), scope: FeedForwardNN/ReLU
  %13 : Float(1, 8) = onnx::Gemm[alpha=1, beta=1, transB=1](%12, %5, %6), scope: FeedForwardNN/ReLU
  %14 : Float(1, 8) = onnx::Relu(%13), scope: FeedForwardNN/ReLU
  %15 : Float(1, 1) = onnx::Gemm[alpha=1, beta=1, transB=1](%14, %7, %8), scope: FeedForwardNN/ReLU
  %output : Float(1, 1) = onnx::Sigmoid(%15), scope: FeedForwardNN/Sigmoid
  return (%output);
}
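For readers who want to sanity-check the dump above, here is a minimal pure-Python sketch of what the graph computes. It is not Glow or ONNX runtime code. It assumes the standard ONNX meaning of Gemm with alpha=1, beta=1, transB=1 (Y = A * B^T + C), and it uses all-zero placeholder weights because the real values of %1 to %8 are not shown here. The helper names (gemm_transb, forward, zeros) are made up for illustration.

```python
import math

def gemm_transb(a, b, c):
    """Gemm with alpha=1, beta=1, transB=1: Y = A @ B^T + C.
    a is rows x k, b is n x k, c has length n and is broadcast over rows."""
    return [[sum(x * w for x, w in zip(row, b_row)) + c[j]
             for j, b_row in enumerate(b)]
            for row in a]

def relu(m):
    return [[max(0.0, v) for v in row] for row in m]

def sigmoid(m):
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in m]

def forward(input_data, weights):
    """weights maps the graph names "1".."8" to values with the listed shapes.
    input_data (shape 1 x 2) is the only value supplied at run time."""
    h = relu(gemm_transb(input_data, weights["1"], weights["2"]))  # %9, %10
    h = relu(gemm_transb(h, weights["3"], weights["4"]))           # %11, %12
    h = relu(gemm_transb(h, weights["5"], weights["6"]))           # %13, %14
    return sigmoid(gemm_transb(h, weights["7"], weights["8"]))     # %15, %output

def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

# Placeholder weights (assumption: the real trained values are not shown here).
weights = {
    "1": zeros(8, 2), "2": [0.0] * 8,
    "3": zeros(8, 8), "4": [0.0] * 8,
    "5": zeros(8, 8), "6": [0.0] * 8,
    "7": zeros(1, 8), "8": [0.0],
}
output = forward([[0.3, -1.2]], weights)  # a 1 x 1 result
```

With real weights substituted in, the result could be compared against what the caffe2 backend returns for the same input.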

The command line looks like this:

/bin $ ./model-runner -model=./ffnn.onnx -network-name="ffnn" -cpu -emit-bundle=./ -verbose

I have confirmed that the graph (ONNX) is constructed correctly by importing it in Python, then using the caffe2 backend to run inference.

I have also tried creating init_net.pb and predict_net.pb, but I get the same error when specifying the *.pb files instead of the *.onnx file.

I am guessing this has something to do with not specifying the input for model-runner (the way we do, for instance, with image-classifier). But then again, there is no way to specify input for model-runner.

Any ideas would be appreciated.

Thanks!

(Jordan Fix) #2

I am guessing this has something to do with not specifying the input for model-runner (the way we do, for instance, with image-classifier). But then again, there is no way to specify input for model-runner.

Yeah, sorry for the confusion, you’ve diagnosed the problem correctly. As I noted in my reply in the other thread, ModelRunner is more of a toy: it requires that the model have no inputs and exactly one output.
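To make the mismatch concrete, here is a small pure-Python sketch that separates the names in the graph header into stored weights and values that must be supplied at run time. It assumes, as is typical for a PyTorch export, that %1 to %8 are stored in the file as initializers; the thread does not confirm this. With the onnx Python package, the two collections would correspond to model.graph.input and model.graph.initializer.

```python
# Names and shapes as declared in the graph header of the first post.
declared_inputs = {
    "input_data": (1, 2),
    "1": (8, 2), "2": (8,),
    "3": (8, 8), "4": (8,),
    "5": (8, 8), "6": (8,),
    "7": (1, 8), "8": (1,),
}

# Assumption: these names have weight data stored in the model file.
stored_weights = {"1", "2", "3", "4", "5", "6", "7", "8"}

# Anything declared but not stored has to be provided by the caller.
runtime_inputs = [name for name in declared_inputs if name not in stored_weights]

print(runtime_inputs)
```

Under that assumption the list contains exactly one name, input_data, so the model has one runtime input where ModelRunner expects none.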

I would suggest creating your own model loader/runner customized for your model, perhaps initially based on ModelRunner since it’s the simplest. It would have a single input named input_data and a single output named output. You can look at ImageClassifier to see how it creates its Caffe2ModelLoader/ONNXModelLoader with inputName along with a Tensor Type inputImageType, and then calls updateInputPlaceholders() to update the Tensor for the input before running.

#3

Got it. Thanks! It sounds to me like one could write a base class that clients could subclass, overriding some methods to create custom inputs and build their own custom models. This could then be made part of the Glow distribution.

(Jordan Fix) #4

Yeah, this was partially the purpose of the Loader class which ModelRunner/ImageClassifier/TextTranslator derive from. Perhaps more things could be pulled into Loader, or into Caffe2ModelLoader/ONNXModelLoader. If you have any ideas we always welcome PRs :)

#5

Yep, I think there is quite a bit of glue logic that can be encapsulated away. I may give it some thought later, once I have more experience with Glow.

A bit off topic: my impression is that currently there is really only the cpu backend; is this correct? In that case, does Glow perform optimizations like taking advantage of CPU SIMD architecture? In particular, it seems quantization would make the network amenable to that sort of optimization. If so, can we verify those optimizations by dumping low level IR or even LLVM asm output (the latter would probably be more appropriate in this case)?

(Jordan Fix) #6

A bit off topic: my impression is that currently there is really only the cpu backend; is this correct?

We have an OpenCL backend that is also under development, but it hasn’t had a ton of work done recently. We additionally have the Habana backend, but that is only useful if you have a Habana accelerator :) Lastly we have the Interpreter backend, but it’s intended as a reference implementation and not super performant.

In that case, does Glow perform optimizations like taking advantage of CPU SIMD architecture? In particular, it seems quantization would make the network amenable to that sort of optimization. If so, can we verify those optimizations by dumping low level IR or even LLVM asm output (the latter would probably be more appropriate in this case)?

Yes, the CPU backend supports vectorized implementations for most of our op kernels. This is not visible in Glow’s low-level IR, but it is visible in LLVM IR and asm. There are two useful command-line options to use here: -dump-llvm-ir and -dump-llvm-asm.
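One rough way to check the dumped asm for vectorization is to count packed floating-point instructions and wide vector registers. The Python sketch below does that for text you have already captured from -dump-llvm-asm. It is a heuristic under stated assumptions: an x86 target, a small hand-picked set of packed mnemonics (such as vmulps and vaddps), and ymm/zmm register names. The function name simd_summary and the sample asm are made up for illustration and are not Glow output.

```python
import re
from collections import Counter

# Packed float arithmetic (SSE/AVX) and FMA mnemonics; deliberately not exhaustive.
PACKED_OP = re.compile(
    r"^\s*(v?(?:add|sub|mul|div|max|min)p[sd]|vfn?m(?:add|sub)\d{3}p[sd])\b"
)
# 256-bit and 512-bit registers; xmm is skipped because scalar code uses it too.
WIDE_REG = re.compile(r"\b([yz]mm)\d+\b")

def simd_summary(asm_text):
    """Return (packed-op counts, wide-register counts) for x86 asm text."""
    ops = Counter()
    regs = Counter()
    for line in asm_text.splitlines():
        match = PACKED_OP.match(line)
        if match:
            ops[match.group(1)] += 1
        for reg in WIDE_REG.findall(line):
            regs[reg] += 1
    return ops, regs

# Hypothetical sample in AT&T syntax, standing in for a real dump.
sample = """\
    vmovups (%rdi,%rax,4), %ymm0
    vmulps  (%rsi,%rax,4), %ymm0, %ymm0
    vaddps  %ymm1, %ymm0, %ymm1
    mulss   %xmm2, %xmm3
"""
ops, regs = simd_summary(sample)
print(dict(ops), dict(regs))
```

Non-zero counts suggest that vectorized kernels were emitted; a zero count does not prove the opposite, since the mnemonic list is incomplete and other targets use different instruction names.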