I found that there is no official third-party serving solution for PyTorch, like TF Serving for TensorFlow.
So I searched how PyTorch users serve models, and found a few approaches like these:
- PyTorch model -> ONNX -> TensorRT Inference Server
- PyTorch model -> JIT (TorchScript) -> TensorRT Inference Server
- PyTorch model -> load in Python -> serve with Flask, etc.
and so on.
I wonder what the common or preferred way to serve a PyTorch model is.