PyTorch Mobile for RAG applications

I'm looking to build a medical offline RAG application for personal use on Android or iOS. Would PyTorch Mobile be feasible for this project?

PyTorch Mobile can be a feasible option for a lightweight RAG setup, especially for personal use. The main things to consider are model size and inference speed: mobile devices have limited memory and compute, so you'll likely need to quantize or distill your models. Also keep in mind that offline RAG means both the retriever and the generator have to run locally, which can get heavy. Consider smaller models like DistilBERT or TinyBERT for retrieval and a compact, distilled language model for generation. It's doable, but it will take some careful optimization. If you're exploring professional support for this kind of mobile AI integration, Idea Maker has experience building efficient, on-device AI solutions.
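To make the retriever/generator split concrete, here is a minimal sketch of the retrieval half in plain Python. Everything in it is an assumption for illustration: the `DOCS` corpus is made-up sample text, and `embed` is a bag-of-words stand-in for whatever on-device encoder you end up using (for example a distilled BERT-style model), not a real embedding model. The `retrieve` and `build_prompt` helpers are hypothetical names, not part of any library.

```python
import math
import re
from collections import Counter

# Toy corpus standing in for locally stored medical notes (made-up sample data).
DOCS = [
    "Ibuprofen is a nonsteroidal anti-inflammatory drug used for pain and fever.",
    "Metformin is commonly prescribed to manage blood sugar in type 2 diabetes.",
    "Amoxicillin is an antibiotic used to treat bacterial infections.",
]


def embed(text):
    """Stand-in embedding: lowercase bag-of-words term counts.

    Assumption: a real app would call an on-device encoder here instead.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# The index is built once and would be stored on the device alongside the documents.
INDEX = [(doc, embed(doc)) for doc in DOCS]


def retrieve(query, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(INDEX, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(query, k=2):
    """Assemble the text that would be handed to the local generator model."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("Which drug helps manage blood sugar in diabetes?"))
```

The point of the sketch is that retrieval itself is cheap; the cost on a phone comes from the encoder behind `embed` and from the generator that consumes the prompt, which is where quantization or distillation would matter.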

The PyTorch team has been building ExecuTorch (GitHub: pytorch/executorch, "On-device AI across mobile, embedded and edge for PyTorch"), which replaced PyTorch Mobile. It has better performance, support, and features, and has been deployed inside Meta on multiple products. We have deprecated PyTorch Mobile.