Inference using model weights, without PyTorch

Is it possible to extract the weights of a trained model and then perform inference purely via matrix multiplication, in any language or system, without needing TorchScript or exporting the model to another format?

(My problem is that I have a model trained in Python (Tacotron 2 for TTS). I now want to use this model in C++ and on mobile devices, which requires compiling the model with TorchScript first, but it seems that this model is not scriptable! I've already posted about this here.)
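In principle, yes: a trained network is just its weights plus the forward computation, and for many layers that computation is a matrix multiply plus a bias. A minimal sketch of the idea, using a hypothetical single `Linear` layer as a stand-in for a real model — once the arrays are extracted, the forward pass needs only NumPy (or any matmul routine in any language):

```python
import numpy as np
import torch

# Hypothetical tiny "model" standing in for a real trained network
model = torch.nn.Linear(4, 2)
model.eval()

# Extract the trained parameters as plain float32 arrays
w = model.weight.detach().numpy()  # shape (2, 4)
b = model.bias.detach().numpy()    # shape (2,)

x = np.random.rand(4).astype(np.float32)

# Inference by hand: y = x @ W^T + b — no torch needed at run time
y_manual = x @ w.T + b

# Sanity check against the PyTorch forward pass
y_torch = model(torch.from_numpy(x)).detach().numpy()
assert np.allclose(y_manual, y_torch, atol=1e-6)
```

The hard part for a model like Tacotron 2 is not the matmuls themselves but faithfully re-implementing every layer (convolutions, LSTMs, attention) and the autoregressive decoding loop in the target language.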

Hi,

While it is not exactly the model you are referring to (Tacotron), people have shown exactly this to be possible.

Some examples:

[1] GGML - tensor structures and inference code implemented in plain C for several cutting-edge models.
[2] whisper.cpp - same author, focused on a single model (OpenAI's Whisper).

You could adapt a similar approach and build on top of these projects :slight_smile: – hope this helps!