How to reduce the deployment footprint

Hi All,

This is the first time I am deploying a model to production (AWS), so I need some help. The server will not have a GPU, and our inference code is in Python; we are not worried about speed right now. The main requirement is to keep the instance size minimal. With that in mind, can someone please tell me how to install PyTorch (CPU only) without pulling in all the extra packages that come with Anaconda? Any other tips would be much appreciated; I am new to this space.
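For context, the standard way to get a CPU-only build without Anaconda is to install from pip using PyTorch's CPU wheel index (a sketch assuming pip is available on the instance; the CPU wheels are much smaller than the default CUDA ones):

```shell
# Install a CPU-only PyTorch build from the official CPU wheel index.
# No Anaconda needed; ideally run this inside a virtualenv.
pip install torch --index-url https://download.pytorch.org/whl/cpu
```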


If you’re interested, you can use TorchServe, which will let you deploy a PyTorch model without having to manage the PyTorch install yourself.

It’s also the default way SageMaker serves PyTorch models, so if you’re already using SageMaker, things should be easy.
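Roughly, the TorchServe workflow is: package the model into a .mar archive with torch-model-archiver, then start the server pointing at the archive. A minimal sketch (the model name, file paths, and handler here are placeholders, not from the original post):

```shell
# Package a serialized model into a TorchServe .mar archive.
# "mymodel", "model.pt", and the handler are illustrative placeholders.
torch-model-archiver --model-name mymodel \
    --version 1.0 \
    --serialized-file model.pt \
    --handler image_classifier \
    --export-path model_store

# Start TorchServe and register the archived model.
torchserve --start --model-store model_store --models mymodel=mymodel.mar
```

Once it is running, you can hit the inference endpoint (default port 8080) with a plain HTTP request.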

An alternative would be to export your model to TorchScript and load it in C++, as described here: TorchScript for Deployment — PyTorch Tutorials 1.10.1+cu102 documentation
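The export side of that route can be sketched in Python like this (the model class here is a throwaway placeholder; the saved file can then be loaded in C++ with torch::jit::load, with no Python runtime on the server):

```python
import torch
import torch.nn as nn

# Placeholder model standing in for your real network.
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

model = MyModel().eval()

# Compile the model to TorchScript; torch.jit.trace(model, example_input)
# is an alternative for models that torch.jit.script cannot handle.
scripted = torch.jit.script(model)

# Save the standalone archive; in C++: torch::jit::load("model.pt")
scripted.save("model.pt")
```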