Trace works, script doesn't + some questions on JIT deployment and performance

KoenT_AS · July 7, 2022, 10:19am

Hi,

I’m trying to use TorchScript JIT (tracing / scripting) to improve inference performance on a PyTorch model (and perhaps make deployment easier too). I read the documentation, including the differences between trace and script.

After modifying a few parts of the code to only use Tensors, I got tracing to work.
I also verified that the inference results for a complete set of examples are exactly the same for the original and the traced model. The measured speed-up is around 15% (inference time went down from 780 ms to 660 ms).

This is what I tried (it’s part of my Classifier class):

```python
# Inside a method of my Classifier class; `example` is a sample input tensor.
with torch.no_grad():
    self.model.eval()
    # self.jit_model = torch.jit.script(self.model)  # error
    # self.jit_model = torch.jit.script(self.model, example_inputs=[(example,)])  # error
    self.jit_model = torch.jit.trace(self.model, example_inputs=(example,))  # works
    # self.jit_model = torch.jit.trace(self.model, example_inputs=(example,), strict=False)  # works
```
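For reference, the equivalence check and timing mentioned above can be sketched roughly as follows. This is a simplified stand-in, not the actual classifier code: `TinyNet`, the input shape, the helper `mean_latency_ms`, and the number of runs are all hypothetical, and the comparison reports the largest absolute difference rather than assuming bit-exact equality.

```python
import time

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the real model (not the MobileNetV2 classifier)."""

    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))

    def forward(self, x):
        return self.body(x)


def mean_latency_ms(fn, inputs, warmup=3):
    """Average wall-clock time per call in milliseconds, after a short warm-up."""
    for x in inputs[:warmup]:
        fn(x)
    start = time.perf_counter()
    for x in inputs:
        fn(x)
    return (time.perf_counter() - start) * 1000 / len(inputs)


torch.manual_seed(0)
model = TinyNet().eval()
example = torch.randn(1, 8)
inputs = [torch.randn(1, 8) for _ in range(50)]

with torch.no_grad():
    traced = torch.jit.trace(model, example_inputs=(example,))
    # Largest element-wise deviation between eager and traced outputs.
    max_diff = max((model(x) - traced(x)).abs().max().item() for x in inputs)
    eager_ms = mean_latency_ms(model, inputs)
    traced_ms = mean_latency_ms(traced, inputs)

print(f"max abs diff: {max_diff:.3g}")
print(f"eager: {eager_ms:.3f} ms/call, traced: {traced_ms:.3f} ms/call")
```

The warm-up calls are there because the first few invocations of a TorchScript module tend to be slower than later ones, so including them would skew the average.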

Now, I have a few questions:

I can’t get torch.jit.script to work on the model; the error I’m seeing is shown below.
How do I find out which part of my code causes this error (and thus makes torch.jit.script fail)? And how would one go about fixing it and moving toward a fully scriptable model?

```text
Traceback (most recent call last):
  File "C:\Code\test_classifier.py", line 107, in <module>
    classifier.initialize_jit_model()  # also initializes the original model
  File "C:\Code\classifier.py", line 202, in initialize_jit_model
    self.jit_model = torch.jit.script(self.model)
  File "C:\Users\ktanghe\anaconda3\envs\myenv_py3\lib\site-packages\torch\jit\_script.py", line 1265, in script
    return torch.jit._recursive.create_script_module(
  File "C:\Users\ktanghe\anaconda3\envs\myenv_py3\lib\site-packages\torch\jit\_recursive.py", line 454, in create_script_module
    return create_script_module_impl(nn_module, concrete_type, stubs_fn)
  File "C:\Users\ktanghe\anaconda3\envs\myenv_py3\lib\site-packages\torch\jit\_recursive.py", line 516, in create_script_module_impl
    script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
  File "C:\Users\ktanghe\anaconda3\envs\myenv_py3\lib\site-packages\torch\jit\_script.py", line 594, in _construct
    init_fn(script_module)
  File "C:\Users\ktanghe\anaconda3\envs\myenv_py3\lib\site-packages\torch\jit\_recursive.py", line 494, in init_fn
    scripted = create_script_module_impl(orig_value, sub_concrete_type, stubs_fn)
  File "C:\Users\ktanghe\anaconda3\envs\myenv_py3\lib\site-packages\torch\jit\_recursive.py", line 516, in create_script_module_impl
    script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
  File "C:\Users\ktanghe\anaconda3\envs\myenv_py3\lib\site-packages\torch\jit\_script.py", line 594, in _construct
    init_fn(script_module)
  File "C:\Users\ktanghe\anaconda3\envs\myenv_py3\lib\site-packages\torch\jit\_recursive.py", line 494, in init_fn
    scripted = create_script_module_impl(orig_value, sub_concrete_type, stubs_fn)
  File "C:\Users\ktanghe\anaconda3\envs\myenv_py3\lib\site-packages\torch\jit\_recursive.py", line 520, in create_script_module_impl
    create_methods_and_properties_from_stubs(concrete_type, method_stubs, property_stubs)
  File "C:\Users\ktanghe\anaconda3\envs\myenv_py3\lib\site-packages\torch\jit\_recursive.py", line 371, in create_methods_and_properties_from_stubs
    concrete_type._create_methods_and_properties(property_defs, property_rcbs, method_defs, method_rcbs, method_defaults)
  File "C:\Users\ktanghe\anaconda3\envs\myenv_py3\lib\site-packages\torch\jit\_recursive.py", line 872, in compile_unbound_method
    create_methods_and_properties_from_stubs(concrete_type, (stub,), ())
  File "C:\Users\ktanghe\anaconda3\envs\myenv_py3\lib\site-packages\torch\jit\_recursive.py", line 371, in create_methods_and_properties_from_stubs
    concrete_type._create_methods_and_properties(property_defs, property_rcbs, method_defs, method_rcbs, method_defaults)
  File "C:\Users\ktanghe\anaconda3\envs\myenv_py3\lib\site-packages\torch\_jit_internal.py", line 1047, in _try_get_dispatched_fn
    return boolean_dispatched.get(fn)
  File "C:\Users\ktanghe\anaconda3\envs\myenv_py3\lib\weakref.py", line 453, in get
    return self.data.get(ref(key),default)
TypeError: cannot create weak reference to 'numpy.ufunc' object
```
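The traceback does not name the offending submodule, so one way to narrow it down is to script each submodule separately and see which ones fail. The sketch below illustrates that idea under stated assumptions: `NumpyFeatures`, `TinyNet`, and the helper `find_unscriptable` are hypothetical names invented for this example, and the exact exception type may differ between PyTorch versions, which is why the helper catches `Exception` broadly.

```python
import numpy as np
import torch
import torch.nn as nn


class NumpyFeatures(nn.Module):
    """Hypothetical stand-in for a feature layer that still calls into numpy."""

    def forward(self, x):
        return torch.from_numpy(np.log1p(x.numpy()))


class TinyNet(nn.Module):
    """Hypothetical stand-in for the full model."""

    def __init__(self):
        super().__init__()
        self.features = NumpyFeatures()
        self.head = nn.Linear(4, 2)

    def forward(self, x):
        return self.head(self.features(x))


def find_unscriptable(module, prefix=""):
    """Return (name, error) pairs for the deepest submodules that fail to script."""
    failures = []
    for name, child in module.named_children():
        full_name = f"{prefix}.{name}" if prefix else name
        try:
            torch.jit.script(child)
        except Exception as exc:  # scripting can raise several exception types
            deeper = find_unscriptable(child, full_name)
            # Blame the child itself only if none of its own children fail.
            failures.extend(deeper or [(full_name, f"{type(exc).__name__}: {exc}")])
    return failures


failures = find_unscriptable(TinyNet())
for name, error in failures:
    print(f"{name}: {error.splitlines()[0]}")
```

Applied to the real model, the names returned would point at the layers whose `forward` (or a function it calls) still needs rewriting, which is a smaller search space than the whole class hierarchy.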

Given that jit.trace works, is there actually any benefit in trying to make jit.script work as well? Could the “scripted” model be (significantly) faster than the “traced” model I have now? Are there any other benefits?
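The documented difference that matters most for this question is data-dependent control flow: tracing records only the operations executed for the example input, while scripting compiles the Python source including its branches. A minimal sketch of that difference, using a hypothetical `Gate` module that is not part of my model:

```python
import warnings

import torch
import torch.nn as nn


class Gate(nn.Module):
    """Hypothetical module whose output depends on a data-dependent branch."""

    def forward(self, x):
        if x.sum() > 0:
            return x * 2
        return torch.zeros_like(x)


gate = Gate().eval()
positive = torch.ones(3)
negative = -torch.ones(3)

with warnings.catch_warnings():
    # Tracing warns that converting a tensor to a Python bool freezes the branch.
    warnings.simplefilter("ignore")
    traced = torch.jit.trace(gate, example_inputs=(positive,))
scripted = torch.jit.script(gate)

# The traced module replays the `x * 2` branch even for the negative input.
traced_matches = torch.equal(traced(negative), gate(negative))
scripted_matches = torch.equal(scripted(negative), gate(negative))
print(traced_matches, scripted_matches)
```

If the model has no such input-dependent branches or loops, this particular benefit of scripting would not apply to it.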
I’m a bit unclear as to how to deploy the “traced” model: I know I can save it using self.jit_model.save("model_traced.pt"). But I read something about the traced model still referencing the original code, which would not be the case for the “scripted” model.
Does that mean that for deployment with a traced model, you still need to have the original model code (and other code in other files that the model depends upon) present on the system you’re deploying to? And is this then not needed for a “scripted” model? Or did I get something wrong here?
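One way to check this empirically is to save a traced module and load it in a separate Python process that never imports the model class. The sketch below does that for a toy module only; `TinyNet`, the `LOADER` snippet, and the file location are hypothetical, and a successful run here says nothing about Python-side pre- or post-processing that lives outside the traced module.

```python
import os
import subprocess
import sys
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in; this class exists only in the exporting process."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 2)

    def forward(self, x):
        return torch.relu(self.fc(x))


# Code run by the "deployment" process: it imports torch only, never TinyNet.
LOADER = (
    "import sys, torch\n"
    "model = torch.jit.load(sys.argv[1])\n"
    "with torch.no_grad():\n"
    "    print(tuple(model(torch.ones(1, 8)).shape))\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model_traced.pt")
    with torch.no_grad():
        traced = torch.jit.trace(TinyNet().eval(), example_inputs=(torch.ones(1, 8),))
    traced.save(path)
    # Run the loader in a fresh interpreter that has no access to TinyNet.
    result = subprocess.run(
        [sys.executable, "-c", LOADER, path],
        capture_output=True,
        text=True,
        check=True,
    )

loaded_shape = result.stdout.strip()
print(loaded_shape)
```

If the subprocess succeeds, the saved file was sufficient on its own for this toy case; repeating the experiment with the real traced model would show whether the same holds there.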
About the speed improvement I’m seeing now with the “traced” model: is this the order of improvement one would expect from tracing (I had hoped for a bit more)? In case it helps to make a judgment call: this is a transfer-learning model built on MobileNetV2, with 2 custom layers for audio feature extraction that use librosa and numpy code.
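The numpy code in those custom layers is also a likely suspect for the scripting error: the last two frames of the traceback show a lookup in a `weakref.WeakKeyDictionary`, which has to create a weak reference to its key, and the key here is a `numpy.ufunc`. The sketch below reproduces the same kind of `TypeError` with the standard library only; `FakeUfunc` is a hypothetical stand-in, not numpy's actual type.

```python
import weakref


class FakeUfunc:
    """Hypothetical stand-in for numpy.ufunc: callable, but not weak-referenceable."""

    __slots__ = ()  # no __weakref__ slot, so weakref.ref(instance) raises TypeError

    def __call__(self, x):
        return x


# Same container type whose `get` appears at the bottom of the traceback.
registry = weakref.WeakKeyDictionary()
try:
    registry.get(FakeUfunc())
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

This suggests the failure is triggered as soon as the compiler resolves a numpy function referenced from scripted code, before it gets as far as reporting a source location.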

Edit: I’m using Python 3.9.12 with PyTorch 1.11.0.

Thanks for any insights / advice!