Does using a 'lite', 'optimized for mobile' model change anything when it's used in native Python code?

After training a model, I decided to serialize it with torch.jit.trace, run it through torch.utils.mobile_optimizer.optimize_for_mobile, and finally save it with _save_for_lite_interpreter.
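
For context, the export code I'm running looks roughly like this (a sketch; the small Sequential network and the input shape are just placeholders for my actual setup):

```python
import torch
import torch.nn as nn
from torch.utils.mobile_optimizer import optimize_for_mobile

# Stand-in for my actual trained network (illustrative only)
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
).eval()

example_input = torch.rand(1, 3, 224, 224)        # illustrative input shape

traced = torch.jit.trace(model, example_input)    # serialize via tracing
optimized = optimize_for_mobile(traced)           # mobile-oriented passes (conv/bn folding, etc.)
optimized._save_for_lite_interpreter("model.ptl") # lite-interpreter file
```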

However, my end goal is to use this model in another Python program (loaded with torch.load()) running on a much weaker PC, NOT on a mobile device or in a C++ program.
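
On that target PC, the loading I had in mind looks roughly like this (a sketch; I'm assuming the .ptl file can be read both by torch.jit.load and by the Python lite-interpreter bindings, and the input shape is illustrative):

```python
import torch
from torch.jit.mobile import _load_for_lite_interpreter

# Option A: load the .ptl as a regular TorchScript module
# (the archive usually still contains the TorchScript code section).
scripted = torch.jit.load("model.ptl", map_location="cpu")

# Option B: load it through the lite-interpreter bindings exposed to Python.
lite = _load_for_lite_interpreter("model.ptl")

x = torch.rand(1, 3, 224, 224)   # illustrative input
with torch.inference_mode():
    out_a = scripted(x)
    out_b = lite(x)
```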

Is there any point in doing the serialization and optimization and saving the model in a lite-interpreter-compatible format? All the examples I've seen online only target Android/iOS or C++ programs.

I’ve noticed a roughly 3x inference speedup on a desktop Intel CPU when using optimize_for_mobile and _save_for_lite_interpreter compared to the plain TorchScript model. Is there any explanation of what’s happening behind the scenes?
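
For reference, this is roughly how I measured it (a sketch; the file names and input shape are placeholders, and I'm assuming the Python lite-interpreter bindings for loading the .ptl file):

```python
import time
import torch
from torch.jit.mobile import _load_for_lite_interpreter

def bench(module, x, iters=100):
    """Average per-call latency in seconds after a short warm-up."""
    with torch.inference_mode():
        for _ in range(10):                  # warm-up
            module(x)
        t0 = time.perf_counter()
        for _ in range(iters):
            module(x)
    return (time.perf_counter() - t0) / iters

x = torch.rand(1, 3, 224, 224)               # illustrative input

plain = torch.jit.load("model_scripted.pt")      # plain TorchScript export
lite = _load_for_lite_interpreter("model.ptl")   # optimized + lite export

print(f"plain torchscript : {bench(plain, x) * 1e3:.2f} ms/iter")
print(f"optimized + lite  : {bench(lite, x) * 1e3:.2f} ms/iter")
```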

The docs also mention that torch.compile() is an option (with the openvino backend for example).
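
The torch.compile() route I was thinking of looks roughly like this (a sketch; it assumes the openvino Python package is installed, since importing openvino.torch is what registers the "openvino" backend, and the model/input are placeholders):

```python
import torch
import torch.nn as nn
import openvino.torch  # noqa: F401  (registers the "openvino" backend)

# Stand-in for my actual trained network (illustrative only)
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10)).eval()

compiled = torch.compile(model, backend="openvino")

x = torch.rand(1, 128)            # illustrative input
with torch.inference_mode():
    y = compiled(x)               # first call compiles; later calls reuse the compiled graph
```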