PyTorch and Caffe2 convergence?

Looking for some transparency on the grand design decisions of the PyTorch and Caffe2 teams :slight_smile:

It seems Caffe2 is faster at introducing new neural net layers into its core, and there also seem to be no systematic benchmarks comparing the speed of ops between the two frameworks.

Is convergence planned, so that both PyTorch and Caffe2 provide the same ops running on the same backends?

Is PyTorch planning to port all of its main ops to native ATen ops?

What is the general plan going forward for torch.script? Will it evolve into a general compiler / optimizer working on graphs, or into some industry-standardized IR (like what LLVM became; that would make support for future languages, say Julia or even JavaScript, easier)?

Thanks!

Road to 1.0 should answer nearly all your questions.
For benchmarks you may look at this, this, or this link. They usually don’t compare single ops, but they are still usable for a general overview, even if they don’t use the current stable release. @smth does not even compare against PyTorch but against torch, which should be equally fast, as huge parts of the backends are/were the same (not sure what has changed with the migration to ATen).

I did read the blog post before posting this, but thanks for the link anyway. Unfortunately, I could not work out the specifics from it; I am re-phrasing my questions below:

What does “we decided to marry PyTorch and Caffe2 which gives the production-level readiness for PyTorch” mean exactly?

Does it mean that more of the backend ops from Caffe2 and PyTorch are going to converge? Yes, from the blog post it is clear that export and interchange are going to become easier, but I could not tell whether more convergence is planned (at least for NVIDIA GPU / CPU ops).

As for torch.script: yes, from the blog post it is clear that it helps to optimize PyTorch (or imported) models, but does it aim at a larger place in the ecosystem (maybe even separate from PyTorch itself)?

I read on GitHub that there is a new backend called C10 in progress, which combines features and backends from ATen and Caffe2. This backend should be more generic, meaning that adding new tensor types and similar features will be easier (the actual discussion was about introducing complex tensors).

To me it reads like this will become a larger ecosystem (together with everything else from the JIT part), and traced models will be Python- (and thus also PyTorch-) independent, but will still have strong backend dependencies (which will most likely be on C10 at that time).
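To make the “Python-independent traced model” point concrete, here is a minimal sketch using the torch.jit tracing API: tracing records the ops executed on an example input and produces a ScriptModule that can be serialized and later loaded (e.g. from C++ via libtorch) without the original Python class definition. The module and file name here are just illustrative.

```python
import torch

# A small illustrative network (hypothetical example, not from the thread).
class TwoLayerNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(4, 8)
        self.fc2 = torch.nn.Linear(8, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = TwoLayerNet()
example_input = torch.randn(1, 4)

# Tracing runs the model once and records the executed ops as a graph.
traced = torch.jit.trace(model, example_input)

# The traced graph can be saved to a standalone file and executed
# without the Python source, e.g. from libtorch in C++.
traced.save("two_layer_net.pt")
```

Note that tracing only captures the path taken for the given example input; data-dependent control flow would need the scripting side of torch.jit instead.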