New announcement from UW and Amazon:
It feels like over-engineering to me and I am skeptical about this for several reasons:
- 2.2x performance improvement with LLVM on CPU vs. hand-tuned OpenBLAS and NNPACK? Even D's Mir GLAS only claims MKL-like performance.
- How does the compiler deal with dynamic graphs / variable-sized inputs?
It seems like the framework is aimed at production, but if not:
- How often do we need to compile? That was already a pain point in Theano.
- What's the build toolchain like? Bazel for TensorFlow is already a huge pain (especially in containers), and this seems even more complicated.
- I see Jenkins/Travis tests, but what is the coverage for ONNX?
(Note that I'm not asking OP to answer; I'm just raising concerns.)
Edit: Would be interested to know what they chose as the default layout: CHWN (Neon), NCHW (Caffe/Torch), or NHWC (TensorFlow), as it has implications for convolution performance and algorithm choice.
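For concreteness, here is a minimal NumPy-only sketch (my own illustration, not from the announcement) of what those layout labels mean and why the choice affects memory locality for convolution kernels:

```python
# N = batch, C = channels, H = height, W = width. The axis order decides
# which dimension is contiguous in memory, which is why conv kernels and
# algorithm choices (im2col vs. direct, vectorization) are tuned per layout.
import numpy as np

n, c, h, w = 8, 3, 224, 224
x_nchw = np.random.rand(n, c, h, w).astype(np.float32)        # Caffe / Torch default
x_nhwc = np.ascontiguousarray(x_nchw.transpose(0, 2, 3, 1))   # TensorFlow default
x_chwn = np.ascontiguousarray(x_nchw.transpose(1, 2, 3, 0))   # Neon default

# Same values, different physical order: in NHWC the channel axis is
# contiguous (fast to walk channels at a fixed pixel), in NCHW the width
# axis is contiguous (fast to walk a row within one channel).
print(x_nchw.shape, x_nhwc.shape, x_chwn.shape)
print(x_nchw.strides, x_nhwc.strides, x_chwn.strides)
```

Whichever layout the stack picks as its default, converting between them is a physical copy, so a mismatched default means either extra transposes at framework boundaries or a second set of tuned kernels.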
Yeah, I agree with your concerns @mratsim: 5 layers of compilers would be a nightmare to debug. This stack can be useful for deploying a really well-tested model, but it will be hard to use in a research or product-development setting.
> How does the compiler deal with dynamic graphs / variable-sized inputs?
ONNX translates dynamic graphs to static ones based on the most-traversed path, so the rest of the toolchain works off the static graph.
Source: a Facebook post.
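As an illustration of what trace-based export means in practice, here is a small sketch using PyTorch's `torch.onnx.export` as a representative front end (my assumption; the exporter discussed in the announcement may differ). Tracing runs the model once on an example input and records only the ops that actually execute, so Python-side control flow is frozen into one static path:

```python
# Sketch (assumes PyTorch with ONNX export available): the tracer records
# the single execution path taken on the example input, so the data-dependent
# branch below disappears from the exported graph.
import torch
import torch.nn as nn

class Dynamic(nn.Module):
    def forward(self, x):
        # Data-dependent control flow: only the branch taken while tracing
        # is captured in the exported static graph.
        if x.sum() > 0:
            return x * 2
        return x - 1

model = Dynamic()
example = torch.ones(1, 4)  # sum > 0, so the "* 2" branch is the one traced
torch.onnx.export(model, example, "dynamic.onnx")

# The exported graph always multiplies by 2, even for inputs whose sum is <= 0.
```

That is also why variable-sized inputs have to be declared explicitly (e.g. via the `dynamic_axes` argument) or handled by recompiling: any shape or branch the tracer saw as concrete is baked into the static graph the rest of the stack consumes.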