How to reduce the deployment footprint

Hi All,

This is the first time I am going to deploy a model in production (AWS), so I need some help. The AWS server will not have a GPU and our inference code will be in Python; we are not worried about speed right now. The main requirement is to keep the size of the instance minimal. With that in mind, can someone please tell me how to install PyTorch (CPU only) without having to install all the extra packages that come with Anaconda? Any other tips would be much appreciated; I am new to this space.

Sincerely,
Vishal
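
On the install question: the CPU-only wheels are much smaller than the default CUDA builds, and you can get them with plain pip in a virtual environment, no Anaconda needed. A minimal sketch (the version pin is just an example; use whichever release you are actually targeting):

```bash
# Create an isolated environment and install the CPU-only PyTorch wheel.
python3 -m venv venv
source venv/bin/activate
pip install torch==1.10.1+cpu -f https://download.pytorch.org/whl/cpu/torch_stable.html
```

Skipping the CUDA runtime is usually the single biggest saving in the install size.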

If you're interested, you can use TorchServe, which will let you deploy a PyTorch model without having to install and wire up PyTorch yourself: https://github.com/pytorch/serve
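
Roughly, the flow is: package your trained weights (plus a handler) into a .mar archive with torch-model-archiver, then point TorchServe at the resulting model store. A sketch with placeholder names (my_model, model.pt, and the built-in image_classifier handler are just examples; it assumes a TorchScript-exported model.pt):

```bash
# Package the model into a .mar archive and serve it.
torch-model-archiver \
  --model-name my_model \
  --version 1.0 \
  --serialized-file model.pt \
  --handler image_classifier \
  --export-path model_store

torchserve --start --model-store model_store --models my_model=my_model.mar

# Inference requests go to port 8080 by default.
curl http://localhost:8080/predictions/my_model -T sample_input.jpg
```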

It's also the default way PyTorch models are served on SageMaker, so if you're already using that, things should be straightforward.
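
If you do go the SageMaker route, the Python SDK wraps most of this for you. A rough sketch, assuming your model artifact is already on S3 and you have an execution role (the bucket, role ARN, and version strings below are placeholders):

```python
# Hypothetical example: deploy a packaged PyTorch model to a small
# CPU instance with the SageMaker Python SDK.
from sagemaker.pytorch import PyTorchModel

model = PyTorchModel(
    model_data="s3://my-bucket/model.tar.gz",  # tarball with the model + code/inference.py
    role="arn:aws:iam::123456789012:role/MySageMakerRole",
    entry_point="inference.py",                # your model_fn / predict_fn hooks
    framework_version="1.10.0",
    py_version="py38",
)

# A small CPU-only instance type keeps the footprint and cost down.
predictor = model.deploy(initial_instance_count=1, instance_type="ml.m5.large")
```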

An alternative would be loading a TorchScript model in C++; see the "TorchScript for Deployment" tutorial in the PyTorch documentation.
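
The export itself happens on your development machine; the production side then only needs the serialized file. A small Python sketch with a stand-in model (substitute your own trained nn.Module):

```python
import torch
import torch.nn as nn

# Stand-in model; replace with your trained network.
model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

# Trace with a representative input and save the TorchScript file.
example_input = torch.randn(1, 16)
scripted = torch.jit.trace(model, example_input)
scripted.save("model_scripted.pt")

# On the server, the file loads without the original Python class definition;
# the same file is what the C++ tutorial loads with torch::jit::load.
loaded = torch.jit.load("model_scripted.pt", map_location="cpu")
with torch.no_grad():
    output = loaded(example_input)
print(output.shape)
```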