optimize_for_mobile: methods from preserved_methods not available via runMethod in Android/Java

Hi,

I successfully exported a module with a method:

    @torch.jit.export
    def inference(self, text_inputs, speaker_embed, style_embed):
        ...

as

    from torch.utils.mobile_optimizer import optimize_for_mobile

    ts_prosody = torch.jit.script(model.prosody_generator)
    ts_prosody.save(f"../inference/{model_name}_prosody.pt")
    ts_prosody_mobile = optimize_for_mobile(
        ts_prosody, preserved_methods=["inference", "inference_half"])
    torch.jit.save(ts_prosody_mobile,
                   f"../inference/{model_name}_prosody_mobile.pt")

I verified that in Python both ts_prosody_mobile.inference() and ts_prosody_mobile.inference_half() exist.
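That is, in Python a check like this passes:

    # both preserved methods exist on the in-memory optimized module
    assert hasattr(ts_prosody_mobile, "inference")
    assert hasattr(ts_prosody_mobile, "inference_half")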

Now in Java/Android I load the models with Module.load, and then

    IValue[] prosodyOutput = modelProsody.runMethod("inference", ...);

works.

But

    IValue[] prosodyOutput = modelProsodyMobile.runMethod("inference", ...);

gives me

    java.lang.IllegalArgumentException: Undefined method inference

Is there anything I missed?

In Python I am using:

    >>> torch.__version__
    '1.9.0+cu111'

and in Java, from the Gradle file:

    implementation 'org.pytorch:pytorch_android:1.9.0'

Thanks!

@JacobSzwejbka could you help with this?

Hi,

I've tried to work around the issue like this:

    from typing import Optional

    from torch import Tensor

    class ProsodyGeneratorExport(ProsodyGenerator):
        def __init__(self, hparams):
            super().__init__(hparams)

        # forward delegates to the existing inference method
        def forward(self,
                    text_inputs: Tensor,
                    speaker_embed: Optional[Tensor] = None,
                    style_embed: Optional[Tensor] = None):
            return self.inference(text_inputs, speaker_embed, style_embed)

And calling optimize_for_mobile without the preserved_methods parameter:

    ts_prosody_mobile = optimize_for_mobile(ts_prosody)

ts_prosody_mobile.forward() can be called.
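To rule out the save step, the file can be reloaded in Python and checked with plain attribute lookups (a sketch; the path is a placeholder):

    import torch

    m = torch.jit.load("prosody_mobile.pt")  # placeholder path

    # True only if the method survived optimization and saving
    print(getattr(m, "forward", None) is not None)
    print(getattr(m, "inference", None) is not None)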
Exporting with torch.jit.save and then loading in Java again:

        IValue[] prosodyOutput = modelProsody.forward(
                IValue.from(textInputTensor),
                IValue.from(speakerInputTensor),
                IValue.from(styleInputTensor)).toTuple();

gives me

    com.facebook.jni.CppException: Method 'forward' is not defined.
    Exception raised from get_method at ../../../../src/main/cpp/libtorch_include/arm64-v8a/torch/csrc/jit/api/object.h:103 (most recent call first):
    (no backtrace available)

Again, the same code works when the model is exported without optimize_for_mobile.

So instead I ended up refactoring my classes from forward/inference to forward_train/forward, and now it works.
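In sketch form (the bodies here are placeholders; the real implementations live in my model classes):

    from typing import Optional

    from torch import Tensor, nn

    class ProsodyGenerator(nn.Module):
        # was forward(): the training path, no longer the exported entry point
        def forward_train(self, text_inputs: Tensor) -> Tensor:
            return text_inputs

        # was inference(): forward is kept by optimize_for_mobile by default
        def forward(self,
                    text_inputs: Tensor,
                    speaker_embed: Optional[Tensor] = None,
                    style_embed: Optional[Tensor] = None) -> Tensor:
            return text_inputs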

The only catch is that (without any further quantization or fusion calls) the optimize_for_mobile version runs 5 seconds for a given task, while the unoptimized one runs 4 seconds.

Without knowing the details of the model this is a bit hard to diagnose. Are you profiling only one call to forward or multiple?

Multiple calls, but it's 3 different models that are called in sequence.
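In sketch form, the measurement loops look like this (placeholder names, paths, and input shapes; the real inputs flow from one model into the next):

    import time

    import torch

    def bench(module, args, warmup=2, iters=10):
        # warm up first so one-time costs don't skew the average
        with torch.no_grad():
            for _ in range(warmup):
                module(*args)
            start = time.perf_counter()
            for _ in range(iters):
                module(*args)
        return (time.perf_counter() - start) / iters

    dummy = torch.randint(0, 100, (1, 50))          # placeholder input
    for name in ("model_a", "model_b", "model_c"):  # placeholder names
        m = torch.jit.load(f"{name}_mobile.pt")     # placeholder paths
        print(name, bench(m, (dummy,)))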

Models A and B are rather heavy on GRUs and LSTMs, together with a few 1D convolutions.
B also uses a Gaussian upsampling mechanism with a partly manual implementation.
Both are significantly slower in the "optimize for mobile" version.
The biggest part of it is this layer

The rest is also quite similar to the model in this repo.

Model C is mostly transposed 1D convolutions, and here the performance is pretty much the same (it looks like modules.py in descriptinc/melgan-neurips on GitHub, at commit 6488045bfba1975602288de07a58570c7b4d66ea).