Is it possible to quantize a scripted model?

Hi, I was wondering if it is possible to quantize a scripted model with either eager or graph mode.

For example, suppose a model has been scripted with torch.jit.trace() in advance. Technically, the scripted model should already contain both the structure and the weights. Given that, is it possible to quantize it with either eager or graph mode?

Thanks all!

cc @jerryzh168 @James_Reed


We did have an API for TorchScript models before (pytorch/quantize_jit.py at master · pytorch/pytorch · GitHub), but it has been de-prioritized and deprecated. The current recommendation is to quantize the model in Python, with either eager mode quantization or FX Graph Mode Quantization.
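For reference, a minimal post-training static quantization sketch with the FX Graph Mode flow might look like the following. This assumes a recent PyTorch (roughly 1.13+) where the `torch.ao.quantization.quantize_fx` APIs are available; the toy module and calibration data are made up for illustration.

```python
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

# Toy eager-mode model (not a TorchScript model -- FX quantization
# operates on regular nn.Modules, before any torch.jit.trace/script).
class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x)

model = M().eval()
example_inputs = (torch.randn(1, 4),)

# Insert observers according to the default qconfig for the fbgemm (x86) backend.
qconfig_mapping = get_default_qconfig_mapping("fbgemm")
prepared = prepare_fx(model, qconfig_mapping, example_inputs)

# Calibrate with representative data so the observers record ranges.
prepared(*example_inputs)

# Convert to an actually quantized model (a torch.fx GraphModule).
quantized = convert_fx(prepared)
out = quantized(torch.randn(1, 4))
```

If TorchScript is still needed for deployment, the resulting `GraphModule` can be scripted afterwards with `torch.jit.script(quantized)`; the key point is that quantization happens in Python before scripting, not on an already-scripted model.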

@jerryzh168 How exactly is FX Graph Mode Quantization considered a replacement for quantize_jit? Can we import a TorchScript model via symbolic tracing and then quantize it in torch.fx?

What’s the reason for deprecating/deprioritizing quantize_jit?

FX Graph Mode Quantization is a replacement in the sense that it is also the automatic, graph-mode quantization flow provided by PyTorch Quantization; however, it cannot quantize a TorchScript model.

The reason for the change is an internal strategy shift on the PyTorch Compiler team: they decided to deprecate TorchScript, and since we depend on their support, we had to switch. We are actually in the middle of discussions about moving to a new IR because of upcoming changes to PyTorch Compiler/PyTorch Core, but that is still under discussion.

@jerryzh168 Thanks for the quick response. Do you mean the whole TorchScript IR is going to be deprecated, or just the quantize_jit API? And is there any public discussion about the new (hopefully export/FX-friendly) IR?

Is there a way to export torch.fx IR to a self-contained file/checkpoint with no code dependency, such that it can be loaded in a PyTorch environment for torch.fx transformations on a different machine with no access to the original source?

By “deprecated” I’m mostly talking about investment at Meta; I’m not sure what the message to OSS is, actually. At Meta, both the TorchScript IR and the quantize_jit API are deprecated. At least for quantize_jit, we are not adding new features or fixes anymore; I’m not exactly sure about TorchScript, though.

Regarding “And is there any public discussion about the new (hopefully export/FX-friendly) IR?” — which one are you referring to here? If you mean the next IR after the FX IR, we do not have public information yet.

Yes, what you described is torch.package (for packaging code into a self-contained format) and torch::deploy (for deploying code in a different environment).
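To illustrate, a minimal torch.package round trip might look like the sketch below. The archive name and the toy model are arbitrary choices for the example; `torch` itself is marked as `extern`, meaning the target environment supplies its own copy of PyTorch, while user code could instead be `intern`ed so the archive carries it with no source dependency.

```python
import torch
import torch.package

model = torch.nn.Linear(4, 2).eval()

# Export the model (code + weights) into a self-contained archive.
with torch.package.PackageExporter("packaged_model.pt") as exporter:
    # torch is assumed to exist in the loading environment, so treat it
    # as an external dependency rather than bundling it.
    exporter.extern("torch.**")
    exporter.save_pickle("model", "model.pkl", model)

# On another machine (with PyTorch installed but no original source),
# the archive can be loaded and the model used or transformed.
importer = torch.package.PackageImporter("packaged_model.pt")
loaded = importer.load_pickle("model", "model.pkl")
out = loaded(torch.randn(1, 4))
```

Since the loaded object is a regular nn.Module, it can then be symbolically traced with `torch.fx.symbolic_trace` for further torch.fx transformations.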

So you can roughly think of TorchScript as being decomposed into torch.fx, torch.package (torch.package — PyTorch 2.1 documentation), and torch::deploy (torch::deploy has been moved to pytorch/multipy — PyTorch 2.1 documentation). You can also check out https://www.youtube.com/watch?v=vXbbaEZbrOI (starting at 1:01:49) for the presentation.


I watched the presentation, and it looks like TorchScript is still in use; it has not been deprecated.
