Deploy PyTorch 1.0 into production: Flask (Python) vs NodeJS (C++ Addons)

I've noticed that people in the PyTorch/LibTorch community have started using Node.js C++ addons to wrap LibTorch code for production. Is this an appropriate pipeline for deploying a LibTorch model behind an API endpoint? Backend inference performance is my main concern.

Most people lately have been using Flask, but when it comes to raw speed, Python is slower than C++ and also slower than JavaScript (Node.js), since Node.js runs on V8, the C++ JIT engine that also powers Chrome.
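For reference, whichever backend ends up wrapping LibTorch, the model typically has to be exported from Python first. A minimal tracing sketch (the ResNet and the `model.pt` file name are just placeholders):

```python
import torch
import torchvision

# Trace the model once in Python; the resulting file can be loaded either
# back into Python (torch.jit.load) or from C++ via torch::jit::load.
model = torchvision.models.resnet18(pretrained=True).eval()
example_input = torch.rand(1, 3, 224, 224)  # dummy input with the expected shape
traced = torch.jit.trace(model, example_input)
traced.save("model.pt")
```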


Agreed; for folks where performance isn't absolutely critical (which covers a lot of use cases!), Flask is a great and easy option.
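For anyone looking for a starting point, a minimal sketch of the Flask route (the traced model file `model.pt` and the JSON payload shape are assumptions, not a fixed API):

```python
import torch
from flask import Flask, jsonify, request

app = Flask(__name__)
model = torch.jit.load("model.pt")  # traced/scripted model, loaded once at startup

@app.route("/predict", methods=["POST"])
def predict():
    # Expect JSON like {"input": [[...], ...]} -- shape depends on your model.
    data = request.get_json(force=True)
    x = torch.tensor(data["input"], dtype=torch.float32)
    with torch.no_grad():
        y = model(x)
    return jsonify({"output": y.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

Loading the model once at startup, rather than per request, is the one thing worth getting right even in a quick prototype.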

The Amazon folks are putting forth an RFC for a high-performance C++ PyTorch serving platform. It would be great to get your thoughts on the RFC:

Cheers!


I think there is plenty of room to improve on both the standard "use Flask" advice and the solutions presented above. Without leaving Python, one can do many things right (like request batching, or using a framework with async support; see the sketch below). Christian also has a great slide deck that lists some desiderata.
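To make the batching point concrete, here is a rough asyncio sketch of dynamic batching. This is only a sketch: `MAX_BATCH`, `MAX_WAIT`, the queue, and `model.pt` are all illustrative assumptions.

```python
import asyncio

import torch

MAX_BATCH = 8      # illustrative -- tune for your workload
MAX_WAIT = 0.010   # seconds to wait for a batch to fill before running anyway

model = torch.jit.load("model.pt")
queue = asyncio.Queue()

def run_model(batch):
    with torch.no_grad():  # no_grad is thread-local, so enter it in the worker thread
        return model(batch)

async def infer(x):
    """Called once per request: enqueue the input and await the batched result."""
    fut = asyncio.get_event_loop().create_future()
    await queue.put((x, fut))
    return await fut

async def batch_worker():
    loop = asyncio.get_event_loop()
    while True:
        items = [await queue.get()]            # block until one request arrives
        deadline = loop.time() + MAX_WAIT
        while len(items) < MAX_BATCH:          # gather more until full or timed out
            remaining = deadline - loop.time()
            if remaining <= 0:
                break
            try:
                items.append(await asyncio.wait_for(queue.get(), remaining))
            except asyncio.TimeoutError:
                break
        xs, futs = zip(*items)
        # One forward pass for the whole batch, off the event loop; the JIT
        # releases the GIL, so other coroutines keep running meanwhile.
        ys = await loop.run_in_executor(None, run_model, torch.stack(xs))
        for fut, y in zip(futs, ys):
            fut.set_result(y)
```

A real server would start `batch_worker()` as a task on the event loop (e.g. `loop.create_task(batch_worker())`) and call `infer()` from each request handler.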

I think it is hard to completely avoid copying if you want your server to decode images (because you go request -> JPEG blob -> Tensor -> rescaled Tensor), but many of the other items from Christian's list can be handled quite nicely using Python + traced models (and gRPC could probably be used to eliminate some of the copying headaches). The JIT also liberates you from GIL annoyances while your model runs.
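To illustrate the decoding chain, a sketch with PIL + torchvision (the 224x224 crop and the normalization constants are the standard ImageNet ones; adjust for your model):

```python
import io

import torch
from PIL import Image
from torchvision import transforms

# request body -> JPEG blob -> PIL image -> Tensor -> rescaled/normalized Tensor
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),                    # HWC uint8 -> CHW float in [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def blob_to_batch(jpeg_blob: bytes) -> torch.Tensor:
    img = Image.open(io.BytesIO(jpeg_blob)).convert("RGB")  # decode (one copy)
    return preprocess(img).unsqueeze(0)       # add batch dimension (more copies)
```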

Best regards

Thomas
