Is `torch.ao.quantization` being migrated to `torchao.quantization`?

I see that these two modules are very different code bases:

But they share basically the same name. torchao seems to be updated more frequently, so is it aiming to replace torch.ao.quantization?

It looks like this question was already asked in an issue in the torchao repo, here, and msaroufim responded:

Eventually, once things get proven out in torchao, they would get upstreamed to torch.ao, so the goal is to have this be a standalone repo with higher development velocity.

So it looks like I had it backwards in my question.

Yeah, in the past quantization was a largely CPU-focused technique; GPU models tended to be CNNs, which are harder to speed up with quantization. As a result, the suite of tools in torch/ao reflects that. It's only in the last 1.5 years, as transformers and perf-optimized GPU use cases have become a much bigger area, that we started focusing on GPU quantization. Core PyTorch has a higher release standard and its own release cadence that doesn't work well for us at the moment. It's easier to move fast in that repo, even if we eventually want to consolidate back into core down the line.
