I’m trying to use ExecuTorch to run a quantized Llama 3.2 3B QAT+LoRA model in an iOS app. The demo app works with the .pte file I generated for that model, but when I went to add ExecuTorch to my own app, it seems like a lot of the code the demo app relies on isn’t available through the ExecuTorch Swift package. The demo app uses a bunch of stuff in executorch/extension/llm, and as far as I can tell, the package only exposes executorch/extension/module/, executorch/extension/tensor/, and executorch/runtime/* (but I could be wrong — I’m not super-familiar with C++!).
Is there a way you’d recommend using the code from executorch/extension/llm in my app? Or is that outside the scope of the ExecuTorch library?
So far I’ve tried:
- copying the files from executorch/extension/llm directly into my app (license permitting, of course), but that seemed brittle since the copies wouldn’t pick up upstream changes, and I also ran into issues with the CMake build script
- getting by with only the code in module.h, like the SwiftPM example (https://pytorch.org/executorch/main/_static/img/swiftpm_xcode.mp4) does, but my use case seems to need most of the text-specific pieces (tokenizer, text decoder, text prefiller, text token generator), so that didn’t work
Thanks!