PyTorch int2 training and export

Hello, I am aware that during training we can make the model aware of low-precision kernels and activations. However, once training is over, how do we export the model in int2 or int1 using torch? Is there a tool available from torch itself? If not, is there a torch equivalent of Larq in TensorFlow?

These are not supported natively right now, but we'll support them through tensor subclasses like torchao/dtypes/uint4.py in the pytorch/ao repo on GitHub. For now, I think you could simulate these with torch.uint8 plus quant_min/quant_max restricted to the range of the desired dtype during quantization, and then lower to the correct quantized ops at the end.
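A minimal sketch of what that workaround might look like, assuming asymmetric per-tensor quantization. `torch.fake_quantize_per_tensor_affine` is a real PyTorch op; the scale/zero-point math, variable names, and the int2 range `[0, 3]` are illustrative choices, not an official torchao API:

```python
import torch

x = torch.randn(4, 4)

# Restrict quant_min/quant_max to the 2-bit unsigned range [0, 3].
quant_min, quant_max = 0, 3

# Illustrative asymmetric scale/zero_point covering the observed range.
scale = float((x.max() - x.min()) / (quant_max - quant_min))
zero_point = int(torch.clamp(
    quant_min - torch.round(x.min() / scale), quant_min, quant_max))

# During training (QAT): fake quantization keeps values in float but snaps
# them onto the int2 grid, so the model sees the low-precision error.
x_fq = torch.fake_quantize_per_tensor_affine(
    x, scale, zero_point, quant_min, quant_max)

# "Export": store the integer codes in a uint8 tensor whose values all lie
# in [0, 3]; a real int2 kernel would pack four such codes per byte.
x_int2 = torch.clamp(
    torch.round(x / scale + zero_point), quant_min, quant_max).to(torch.uint8)

# Dequantize for a sanity check against the fake-quantized tensor.
x_deq = (x_int2.float() - zero_point) * scale
assert torch.allclose(x_fq, x_deq)
```

The same `quant_min`/`quant_max` restriction can be passed to observers and fake-quantize modules in the eager-mode quantization workflow, so the range constraint carries through training until you lower to the actual quantized ops.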
