PyTorch 1.5 - is TorchScript still required? Do pad tokens require computation?

Damiox · May 13, 2020, 3:59pm

I see PyTorch 1.5 has brought a lot of improvements to the C++ API. I have a model that I use from Python. If I want to load this model from C++ or Java, do I still need TorchScript as an intermediate step? I assume so, because the model is defined in Python and has to be compiled with the JIT before other languages can load it. Please confirm this.

Also, my current concern about TorchScript is having to commit to a fixed sentence length. My model is GPT-2, and I want to avoid padding as much as possible because I think pad tokens are still computed by the network, though I’m not sure about this point. Should I keep multiple TorchScript models for different sentence lengths? I believe a single TorchScript model fixed at 1024 tokens may hurt performance for sentences with fewer than 100 tokens, since most of the input would then be pad tokens. If no computation is done for them, I’ll be safe. How is this usually handled?

albanD · May 13, 2020, 4:10pm

Hi,

For the first point, I think the C++ API has a very different goal. If you were using it, you would need to reimplement your model in C++ and then load the weights, as you would with load_state_dict in Python.
TorchScript does not require that, as it lets you serialize the whole model together with its weights.
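
To illustrate the serialization point, here is a minimal sketch. TinyModel is a hypothetical stand-in (not GPT-2), and an in-memory buffer is used instead of a file so the example runs anywhere. It scripts the module, saves it, and loads it back without the Python class definition being needed at load time.

```python
import io

import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for a real model; not GPT-2."""

    def __init__(self) -> None:
        super().__init__()
        self.embed = nn.Embedding(100, 16)
        self.proj = nn.Linear(16, 100)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.proj(self.embed(token_ids))


model = TinyModel().eval()

# Compile the module to TorchScript; code and weights travel together.
scripted = torch.jit.script(model)

# Serialize to an in-memory buffer (a file path would work the same way).
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)

# Load the archive back; TinyModel's Python source is not required here.
restored = torch.jit.load(buffer)

token_ids = torch.tensor([[1, 2, 3]])
outputs_match = torch.allclose(model(token_ids), restored(token_ids))
```

When saved to a file instead (for example with scripted.save and a path of your choosing), the same archive is what torch::jit::load reads on the C++ side.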

For the tokens in your model, I’m not sure. But if you use TorchScript, it should behave just like the Python version of your model, so you just need to make sure not to add extra padding on the Python side.
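
On the open question of whether pad positions cost computation, a small sketch may help. It assumes a plain nn.Linear as a stand-in for one position-wise layer of a transformer (it is not GPT-2 itself). In typical dense implementations the layer is applied to every position in the tensor, padded or not, so the work grows with the padded length; masking usually changes what padded positions contribute, not whether they are processed.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for one position-wise layer; an assumption for illustration.
layer = nn.Linear(16, 16)

# Five "real" token vectors, and the same five padded out to 1024 positions.
real = torch.randn(1, 5, 16)
padded = torch.zeros(1, 1024, 16)
padded[:, :5] = real

out_real = layer(real)
out_padded = layer(padded)

# The padded input yields an output for all 1024 positions,
# even though only the first five carry real data.
real_shape = tuple(out_real.shape)      # (1, 5, 16)
padded_shape = tuple(out_padded.shape)  # (1, 1024, 16)

# The real positions get the same result either way.
real_part_matches = torch.allclose(out_padded[:, :5], out_real, atol=1e-5)
```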

Damiox · May 13, 2020, 5:25pm

What if I used scripting (torch.jit.script) instead of tracing (torch.jit.trace) with TorchScript? Would I still have to commit to a fixed sentence length if I were using torch.jit.script?

albanD · May 13, 2020, 5:33pm

If you use script, you won’t. The branching will work fine.
If you use tracing, you will indeed have to keep a fixed sequence length.
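
The difference in branching can be sketched with a toy function. Here pool is a hypothetical example whose behavior depends on the sequence length; it is not part of any real model. Scripting compiles the if statement itself, while tracing records only the path taken for the example input (PyTorch typically emits a TracerWarning about this).

```python
import torch


def pool(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical length-dependent logic: sum long inputs, average short ones.
    if x.size(1) > 4:
        return x.sum(dim=1)
    else:
        return x.mean(dim=1)


# Scripting keeps the branch as real control flow.
scripted = torch.jit.script(pool)

# Tracing runs the function once with a length-8 example, so only the
# "sum" path is recorded in the resulting graph.
traced = torch.jit.trace(pool, torch.ones(1, 8))

short = torch.ones(1, 2)
expected = pool(short)              # eager Python takes the "mean" branch

scripted_out = scripted(short)      # follows the "mean" branch as well
traced_out = traced(short)          # replays the recorded "sum" path

scripted_ok = torch.allclose(scripted_out, expected)
traced_ok = torch.allclose(traced_out, expected)
```

With this input, scripted_ok is True and traced_ok is False: the traced function sums the two ones instead of averaging them. Models whose operations do not branch on length may behave differently under tracing, so this sketch only shows the control-flow case.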