I’ve spent a couple of days trying to get MobileSAM (Segment Anything based on TinyViT) running on the GPU on iOS, but so far I’ve only managed to get a CPU version working, with a latency of 2 seconds.
What are my options for speeding this up?
- PyTorch Mobile with the Metal backend: PyTorch Lite doesn’t support a couple of operators, which I worked around, but in the end the Metal backend’s lack of support for strided convolutions became a blocker.
- PyTorch MPS backend: there is no information on whether this even targets mobile.
- Apple coremltools: converting via ONNX; only a subset of operators is supported.
- ONNX Runtime?
- Converting to TFLite via ONNX? I got some weird metadata there that became a blocker (operator names where operator type names should be).
- PyTorch ExecuTorch: still in development; earliest availability looks like end of 2024.
It seems that supporting this end-to-end has not been a priority for either Apple or PyTorch… Is there a way out?