When should I use Tracing rather than Scripting?

I went through the official doc of TorchScript (https://pytorch.org/docs/stable/jit.html) but didn’t understand clearly what is the advantage of Tracing over Scripting.

As far as I understand, both jit.script and jit.trace can convert existing nn.Module instances into TorchScript. However, Tracing cannot handle control flow such as if/for, and it also requires an example input. The inability to handle control flow sounds like a huge deal breaker.

The only disadvantage of Scripting I noticed is that it cannot handle several built-in modules like RNN/GRU.

Are there any reasons to use Tracing over Scripting?

Thank you.

BTW, can LSTM be scripted? It is a little odd that RNN and GRU cannot be scripted but LSTM can.


Tracing lets you use dynamic behavior in Python, since it just records tensor operations. This may work better for your use case, but as you pointed out there are some fundamental limitations, such as the inability to trace control flow / Python values. Both should be pretty easy to try out on your codebase (i.e. call trace() or script() on a module). If tracing does not work out of the box (e.g. if you use control flow), it is unlikely that you can maintain the semantics of your model and get it working under tracing. Scripting may require some work to get your model using only features supported by the compiler, but you will probably be able to get it working with some code changes.
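
To make that concrete, here is a minimal sketch (the `Gate` module is made up for illustration) of trying both on the same module, and what goes wrong under tracing when a branch depends on the data:

```python
import torch
from torch import nn

# Hypothetical module: the branch taken depends on the input values.
class Gate(nn.Module):
    def forward(self, x):
        if x.sum() > 0:
            return x + 10
        return x - 10

example = torch.ones(3)  # takes the `x + 10` branch during tracing

# Tracing records only the operations from the branch the example input hit
# (and emits a TracerWarning about the data-dependent condition).
traced = torch.jit.trace(Gate(), example)

# Scripting compiles both branches from the source code.
scripted = torch.jit.script(Gate())

neg = -torch.ones(3)
traced(neg)    # still computes x + 10: the branch baked in at trace time
scripted(neg)  # computes x - 10, matching the Python semantics
```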

GRU and LSTM can both be compiled on master/nightly (AFAIK only LSTM is in the v1.2.0 release); we haven’t had many requests for RNN yet. If you’d like to see it (or something else in PyTorch) be script-able that isn’t already, please file an issue on GitHub.

@driazati Thank you for reply.

Do you mean that Tracing can handle almost all Python features/libraries except for control flow like for/while/if, while Scripting can only handle a subset of Python features (aka TorchScript)? I still suspect there are many Python features that cannot be used with Tracing, and therefore that the difference between Tracing and Scripting is very small. Could you kindly give me an example where we should use Tracing over Scripting?

Tracing can handle anything that uses only PyTorch tensors and PyTorch operations. If someone passed a PyTorch tensor to a Pandas dataframe and did some operations, tracing wouldn’t capture that (though neither would script at this point), so there are limitations. If the only data flowing around your computations are tensors and there is no control flow, tracing is probably the way to go. Otherwise, use scripting.
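
A small sketch of that limitation (`ScaleByMax` is a made-up module): as soon as a value leaves tensor-land, e.g. via `.item()`, tracing bakes it into the graph as a constant:

```python
import torch
from torch import nn

# Made-up module: normalizes the input by its max value.
class ScaleByMax(nn.Module):
    def forward(self, x):
        scale = x.max().item()  # leaves tensor-land: recorded as a Python constant
        return x / scale

# Tracing warns here and freezes `scale` at the value seen in the example (2.0).
traced = torch.jit.trace(ScaleByMax(), torch.tensor([1.0, 2.0]))

# New inputs with a different max are still divided by the traced 2.0,
# silently producing wrong results.
traced(torch.tensor([1.0, 4.0]))  # divides by 2.0, not by 4.0
```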

The pytext library uses a mix of scripting and tracing, and it all generally works well since they can be mixed together pretty seamlessly.
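
A minimal sketch of that mixing pattern (the module names here are made up): trace the tensor-only part, then script a wrapper that adds control flow around it:

```python
import torch
from torch import nn

# Tensor-only computation: safe to trace.
class Backbone(nn.Module):
    def forward(self, x):
        return torch.relu(x @ x.t())

class Wrapper(nn.Module):
    def __init__(self):
        super().__init__()
        # The traced submodule is already a ScriptModule...
        self.backbone = torch.jit.trace(Backbone(), torch.randn(3, 3))

    def forward(self, x, n: int):
        # ...so scripted control flow can call into it freely.
        for _ in range(n):
            x = self.backbone(x)
        return x

scripted = torch.jit.script(Wrapper())
```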

I have got similar confusion.

It seems that even if “the only data flowing around your computations are tensors and there is no control flow”, we can still use scripting. Does this mean we can actually use scripting in all cases? Is there any speed difference between the two?

Thanks.

Right, the only thing that would work in tracing but not scripting is the use of Python language features and dynamic behavior that script mode doesn’t support. Since scripting compiles your code, you may have to do some work to make the compiler happy (e.g. add type annotations). But it is easy to try: just pass your module to torch.jit.script. The two compile to the same IR under the hood, so the speed should be about the same.
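
For example (a hypothetical module), the compiler assumes an unannotated argument is a `Tensor`, so a module that takes a list needs an explicit annotation to compile:

```python
import torch
from torch import nn
from typing import List

class SumAll(nn.Module):
    # Without the List[torch.Tensor] annotation, the compiler would
    # assume `xs` is a single Tensor.
    def forward(self, xs: List[torch.Tensor]) -> torch.Tensor:
        total = torch.zeros_like(xs[0])
        for x in xs:
            total = total + x
        return total

scripted = torch.jit.script(SumAll())
```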


Hi,
I don’t understand what you mean by the dynamic behaviour and Python language features that are supported in trace but not in script. Also,

what is the difference between the TorchScript compiler and the JIT compiler?

Scripting a function or `nn.Module` will inspect the source code and compile it as TorchScript code using the TorchScript compiler. Tracing a function returns an executable that will be optimized using just-in-time compilation.

Could you please explain those in detail?

Thanks.

Hi, would you be able to give an example of dynamic behavior / Python language features that script mode doesn’t support? It sounds a bit abstract to me compared to the control flow statements that I know only work with script mode.

It was also unclear to me in which cases trace would be superior/more useful than scripting. I finally found an example in my own code:

import torch
from torch import nn


class MyModule(nn.Module):
    def __init__(self, return_b=False):
        super().__init__()
        self.return_b = return_b

    def forward(self, x):
        a = x + 2
        if self.return_b:
            b = x + 3
            return a, b
        return a


model = MyModule(return_b=True)

# Will work: only the branch taken for this instance is recorded
traced = torch.jit.trace(model, (torch.randn(10), ))

# Will fail: the two return statements have different types
# (Tuple[Tensor, Tensor] vs Tensor), which the compiler rejects
scripted = torch.jit.script(model)

This can easily be changed to be scriptable, but if you know the control flow is static once the model is exported, tracing will work just fine.
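
For completeness, one way the module above could be made scriptable (a sketch, using an `Optional` second output so both branches return the same type):

```python
import torch
from torch import nn
from typing import Optional, Tuple

class MyScriptableModule(nn.Module):
    def __init__(self, return_b: bool = False):
        super().__init__()
        self.return_b = return_b

    def forward(self, x) -> Tuple[torch.Tensor, Optional[torch.Tensor]]:
        a = x + 2
        if self.return_b:
            return a, x + 3
        # Both branches now return the same type, so scripting succeeds.
        return a, None

scripted = torch.jit.script(MyScriptableModule(return_b=True))
```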