Hi forum,
I’m basically completely new to C++. I would like to deploy a PyTorch-trained model as a continuously running web service in production. It should accept HTTP requests with one or more images attached and respond with the inference results. If I were doing this in Python, I would use Django/Flask as the web framework, but I would like to dive into a C++ runtime environment for lower inference latency. Can anyone share some rough ideas on how to achieve the same goal? For example, should I look for a similar web service framework written in C++, or should I head in another direction entirely?
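To make my question more concrete, here is a rough sketch of the inference core I have in mind, based on the standard LibTorch approach: the model would be exported to TorchScript in Python first (via `torch.jit.trace` or `torch.jit.script`), then loaded and run from C++. The HTTP layer (e.g. a C++ framework such as Crow or Drogon, just as examples) would call something like this per request; the file name and input shape below are placeholders.

```cpp
// Sketch of a LibTorch inference core (assumes a TorchScript-exported model).
// Build against LibTorch: https://pytorch.org/cppdocs/
#include <torch/script.h>

#include <iostream>
#include <vector>

int main(int argc, char* argv[]) {
    if (argc < 2) {
        std::cerr << "usage: serve <model.pt>\n";
        return 1;
    }

    // Load the TorchScript module once at startup, not per request.
    torch::jit::script::Module module = torch::jit::load(argv[1]);
    module.eval();

    // Dummy tensor standing in for a decoded, preprocessed image batch
    // (N x C x H x W); in a real service this would come from the
    // image bytes in the HTTP request body.
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(torch::ones({1, 3, 224, 224}));

    torch::NoGradGuard no_grad;  // inference only, skip autograd bookkeeping
    at::Tensor output = module.forward(inputs).toTensor();
    std::cout << output.sizes() << '\n';
    return 0;
}
```

I gather the key point is that the model load happens once and each request only pays for preprocessing plus `forward`, but I’m not sure whether this is the right overall direction.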
Any ideas would be appreciated:)