GPU inference on iOS

Hi,
I’ve spent a couple of days trying to get MobileSAM (Segment Anything based on TinyViT) running on the GPU on iOS, but so far I’ve only managed to get a CPU version working, with a latency of 2 seconds.
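For reference, the CPU path that does work for me looks roughly like this. It’s only a sketch: the `sam_model_registry` entry point and checkpoint path are taken from the MobileSAM repo, and I haven’t re-verified the exact calls while writing this post.

```python
# Rough sketch of the CPU-only export that gives me the ~2 s latency.
# Assumes the MobileSAM repo's sam_model_registry API and checkpoint path.
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile
from mobile_sam import sam_model_registry

sam = sam_model_registry["vit_t"](checkpoint="./weights/mobile_sam.pt")
encoder = sam.image_encoder.eval()

example = torch.rand(1, 3, 1024, 1024)   # SAM's default input resolution
traced = torch.jit.trace(encoder, example)

# The default (CPU) mobile optimization goes through; backend="metal" is
# where the strided-convolution blocker mentioned below shows up.
optimized = optimize_for_mobile(traced)
optimized._save_for_lite_interpreter("mobilesam_encoder_cpu.ptl")
```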

What would be the options for speeding this up…?

  • PyTorch Mobile with the Metal backend: PyTorch Lite doesn’t support a couple of operators, which I was able to work around, but in the end the Metal backend’s missing support for strided convolutions became a blocker.
  • PyTorch MPS backend: there is no information about whether this even targets mobile.
  • Apple coremltools: convert via ONNX - only a subset of operators is supported (a rough sketch of the coremltools route follows this list).
  • ONNX Runtime? (the ONNX export sketch further below would be the starting point)
  • Convert to TFLite via ONNX? The converted model had some weird metadata that made this a blocker (operator names where operator type names should be).
  • PyTorch ExecuTorch: still in development, earliest end of 2024.
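On the coremltools item: newer coremltools versions also have a direct TorchScript frontend, so a conversion attempt doesn’t necessarily have to go through ONNX. A minimal sketch of what I mean (the model handle, checkpoint path and compute-unit choice are my assumptions, not something I’ve verified on device):

```python
# Sketch of a coremltools conversion straight from TorchScript (no ONNX step).
# Whether every TinyViT op converts cleanly is exactly the open question.
import torch
import coremltools as ct
from mobile_sam import sam_model_registry

encoder = sam_model_registry["vit_t"](checkpoint="./weights/mobile_sam.pt").image_encoder.eval()
example = torch.rand(1, 3, 1024, 1024)
traced = torch.jit.trace(encoder, example)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="image", shape=example.shape)],
    convert_to="mlprogram",            # ML Program format (.mlpackage)
    compute_units=ct.ComputeUnit.ALL,  # let Core ML pick GPU / Neural Engine
)
mlmodel.save("MobileSAMEncoder.mlpackage")
```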

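The ONNX Runtime and TFLite-via-ONNX items both start from the same ONNX export, which for me looks roughly like this (the opset version and output naming are guesses on my part):

```python
# ONNX export that the ONNX Runtime / TFLite routes would start from.
import torch
from mobile_sam import sam_model_registry

encoder = sam_model_registry["vit_t"](checkpoint="./weights/mobile_sam.pt").image_encoder.eval()
example = torch.rand(1, 3, 1024, 1024)

torch.onnx.export(
    encoder,
    example,
    "mobilesam_encoder.onnx",
    input_names=["image"],
    output_names=["image_embeddings"],
    opset_version=17,
)

# For a quick sanity check off-device, ONNX Runtime's CoreML execution
# provider (if your build includes it) would be used roughly like this:
# import onnxruntime as ort
# sess = ort.InferenceSession("mobilesam_encoder.onnx",
#                             providers=["CoreMLExecutionProvider", "CPUExecutionProvider"])
```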
It seems that making this work and supporting it end-to-end has not been a priority for either Apple or PyTorch… Is there a way out?
