Deployment of PyTorch models in production

I have a couple of questions about using PyTorch models in production.

  1. How can we deploy PyTorch models behind a stateless server in the cloud?

  2. What’s the best practice for deploying a model offline to do inference on a phone? I know about the tracing and direct-embedding approach (roughly what the sketch below shows), but from a security perspective, is it safe to ship an APK built that way? By safety I mean: can someone reverse-engineer the weights of my model from the APK?
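
For concreteness, this is roughly the tracing workflow I mean (the model and input shape here are just placeholders for my own network):

```python
import torch
import torchvision
from torch.utils.mobile_optimizer import optimize_for_mobile

# Placeholder model; in my case this would be my own trained network.
model = torchvision.models.mobilenet_v2()
model.eval()

# Trace with a dummy input of the shape expected on-device.
example_input = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example_input)

# Optional mobile-specific optimizations (operator fusion, etc.).
optimized = optimize_for_mobile(traced)

# Save for the PyTorch Mobile lite interpreter; this file is what gets
# bundled into the APK's assets, weights and all.
optimized._save_for_lite_interpreter("model.ptl")
```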

I’d suggest taking a look at TorchServe for model serving. I’m not sure it can run fully stateless, though; I assume you mean something like a Lambda function?
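
If a Lambda-style function is what you’re after, one common pattern is to load the traced model once per container at cold start and keep the handler itself stateless. A minimal sketch, assuming an API-Gateway-style JSON event; the path, input shape, and field names are placeholders:

```python
import json
import torch

# Loaded once per container (cold start); reused across invocations.
# "model.pt" is a placeholder path to a traced/scripted model.
_model = torch.jit.load("model.pt")
_model.eval()

def handler(event, context):
    # Hypothetical entry point: expects a JSON body with a flat list of
    # floats matching the model's input shape.
    payload = json.loads(event["body"])
    x = torch.tensor(payload["input"], dtype=torch.float32).reshape(1, 3, 224, 224)
    with torch.no_grad():
        y = _model(x)
    return {
        "statusCode": 200,
        "body": json.dumps({"prediction": y.argmax(dim=1).item()}),
    }
```

TorchServe itself packages models with torch-model-archiver and serves them over HTTP, so it maps more naturally to a long-running container than to a per-request function.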