I expect that most people are using ONNX to transfer trained models from Pytorch to Caffe2 because they want to deploy their model as part of a C/C++ project. However, there are no examples which show how to do this from beginning to end.
From the Pytorch documentation here, I understand how to convert a Pytorch model to ONNX format using torch.onnx.export, and also how to load that file into Caffe2 using onnx.load + onnx_caffe2.backend… but that’s just the python side of things.
I also understand something about loading pretrained Caffe2 models into a C++ project from .pb files as described in the Caffe2 C++ tutorials here. (Click on the pretrained.cc link.)
What I’m missing are the steps in between. How do I take the output from onnx_caffe2.backend and create the .pb files found in the Caffe2 C++ tutorials? Maybe these steps are obvious to a seasoned Caffe2 user, but this is my first exposure. A step-by-step recipe would probably help a lot of Pytorch users.
Actually, with the ONNX-Caffe2 package you can easily turn an ONNX model into a Caffe2 model and then dump it into .pb files.
Here is an example:
import onnx
import torch
from onnx_caffe2.backend import Caffe2Backend

# G is the trained PyTorch model and x is an example input tensor
onnx_proto_file = "/onnx.proto"
torch.onnx.export(G, x, onnx_proto_file, verbose=True)

# Load the exported ONNX model and convert its graph into a pair of Caffe2 nets
onnx_model = onnx.load(onnx_proto_file)
init_net, predict_net = Caffe2Backend.onnx_graph_to_caffe2_net(onnx_model.graph)

# Dump both nets, in binary and in human-readable text form
with open("onnx-init.pb", "wb") as f:
    f.write(init_net.SerializeToString())
with open("onnx-init.pbtxt", "w") as f:
    f.write(str(init_net))
with open("onnx-predict.pb", "wb") as f:
    f.write(predict_net.SerializeToString())
with open("onnx-predict.pbtxt", "w") as f:
    f.write(str(predict_net))
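To complete the recipe on the loading side, the dumped .pb files can be read back and run through a Caffe2 Predictor; the pretrained.cc C++ tutorial linked above does essentially the same thing with the same two files. This is only a minimal sketch assuming a single image-like input; the shape below is a placeholder and must match whatever x was at export time.

import numpy as np
from caffe2.python import workspace

# Read the serialized nets produced above
with open("onnx-init.pb", "rb") as f:
    init_net = f.read()
with open("onnx-predict.pb", "rb") as f:
    predict_net = f.read()

# The Predictor runs the init net once and keeps the predict net ready to run
p = workspace.Predictor(init_net, predict_net)

# Placeholder input; use the same shape/dtype as the example input x above
img = np.random.rand(1, 3, 224, 224).astype(np.float32)
results = p.run([img])
print(results[0].shape)

The C++ side follows the same steps: read the two protobufs, build a predictor (or run the init net once in a workspace), feed the input blob, and run the predict net.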
Jin, could you elaborate? 1. What is the current best option for deploying Python-trained models to a high-performance C++ runtime (ideally supporting accelerators)? 2. What option will the PyTorch team recommend in the future (6-12 months)? I.e., where are we now, where will we be in 6-12 months, and will ONNX be replaced by libtorch?
I think the best way to deploy a Python-trained model currently depends on your target platform. If you want to use a GPU, the fastest way is TensorRT; if you have a CPU, use something like QNNPACK instead.
The PyTorch team should encourage us to export to ONNX for deployment.
Hello everyone,
I must confess I am a total newbie to the PyTorch framework and to neural networks in general. I have been reading and practicing for about 2 months now. I am working on my thesis and need to implement some classification at some point. I currently have a pretrained model in Caffe and would like to convert it to the C++ version of PyTorch, since I am a bit comfortable with PyTorch now.
I would be glad if someone could direct me to a reading resource/library/tutorial that does this: Caffe2 -> C++ torch. Most of what I have found covers loading from Caffe into PyTorch in Python, not C++.
Secondly, should I succeed in the conversion, will the new model in torch still be a trained model, or would I have to retrain it?
I'll be glad if someone could give me a prompt response, as I am running out of time. Thanks in advance. @jinfagang @houseroad
For TensorRT, with the improvements in the ONNX exporter, the experience should be better now. We can ping Nvidia folks if there are any issues: https://github.com/onnx/onnx-tensorrt
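For reference, onnx-tensorrt also ships a small Python backend, so you can sanity-check an exported model against TensorRT before wiring up a C++ runtime. This is a minimal sketch roughly following that repo's README; the model path and input shape are placeholders:

import numpy as np
import onnx
import onnx_tensorrt.backend as backend

# Load the exported ONNX model and build a TensorRT engine for it
model = onnx.load("/path/to/model.onnx")
engine = backend.prepare(model, device="CUDA:0")

# Placeholder input; match your model's expected shape and dtype
input_data = np.random.random(size=(1, 3, 224, 224)).astype(np.float32)
output_data = engine.run(input_data)[0]
print(output_data.shape)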
Generally speaking, for backend performance libraries (relating to the QNNPACK comment above), the following is guidance depending on the target platform:
Server CPU (x86) - MKL-DNN and FBGEMM (8-bit integer quantized)
ARM/NEON CPU - QNNPACK (8-bit integer quantized, ready today) and XNNPACK (integration currently WIP)
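To make that mapping concrete, PyTorch exposes FBGEMM and QNNPACK as its quantized engines, so picking the backend is a one-line setting. Below is a minimal sketch using dynamic quantization on a toy nn.Linear model (the model itself is just a stand-in); switch the engine string to "qnnpack" when targeting ARM:

import torch
import torch.nn as nn

# Select the quantized kernel backend for the target platform:
# "fbgemm" for x86 servers, "qnnpack" for ARM/mobile.
torch.backends.quantized.engine = "fbgemm"

# Stand-in model; any model with nn.Linear layers works for dynamic quantization
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
model.eval()

# Dynamic quantization: weights stored as int8, activations quantized on the fly
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
print(quantized)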