Does pytorch support training with low-precision INT8?

rps · April 26, 2018, 4:21pm

I am trying to experiment with low-precision (INT8) training. Is there support for this in PyTorch? Also, which quantization methods are supported?

Marat · June 28, 2019, 8:59am

Training with int8 - no chance due to numerical stability limitations I guess, but inference on int8 is very interesting.
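One way to picture the stability concern is the sketch below. It assumes the weight is stored as int8 with plain round-to-nearest, and the scale, learning rate, and gradient are hypothetical values chosen for illustration. When an update is smaller than half a quantization step, re-quantizing after every step rounds it away, so the int8 weight never moves while the float weight does.

```python
def quantize_weight(w, scale):
    """Round a float weight to the nearest int8 level, clamped to [-128, 127]."""
    return max(-128, min(127, round(w / scale)))

scale = 1.0 / 127        # hypothetical: weights assumed to lie in [-1, 1]
lr, grad = 0.01, 0.1     # hypothetical learning rate and constant gradient

w_int8 = 64              # int8 level, represents 64 * scale (about 0.504)
w_float = w_int8 * scale # same starting point kept in floating point

for _ in range(100):
    # Float path: the small update accumulates normally.
    w_float -= lr * grad
    # Int8 path: dequantize, apply the update, re-quantize every step.
    w_int8 = quantize_weight(w_int8 * scale - lr * grad, scale)

print("float weight moved by:", 64 * scale - w_float)
print("int8 level after 100 steps:", w_int8)
```

This only shows the naive scheme. It says nothing about approaches that keep higher-precision copies of the weights or use a different rounding rule.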

linsonghere · September 11, 2019, 8:54am

However, when I use INT8 for computation, it is not faster; it is actually slower. I wonder why.

Marat · September 16, 2019, 11:59am

PyTorch does not support efficient INT8 scoring, and if you do not have a Volta GPU you will not see any speedup from fp16 in either training or scoring. If you want fast INT8 scoring, consider using TensorRT: you will get up to 3x faster scoring on ResNet-like nets in INT8 with “slightly” lower accuracy.

polarbaryon · September 16, 2019, 5:34pm

I am not aware of any native 8-bit or lower training or, for that matter, inference, in contrast to something like tflite, which only supports it in specific instances. I assume this is partly because there is no canonical method for quantization. There are a number of implementation examples though; see, e.g., https://github.com/eladhoffer/quantized.pytorch, or Glow.
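To make "quantization" concrete, here is a minimal sketch of one common choice: per-tensor affine (scale plus zero point) quantization to unsigned 8-bit. It is not taken from the linked repository or from Glow, and the function names `quantize_affine` and `dequantize_affine` are made up for illustration. Other schemes (symmetric, per-channel, non-uniform) differ in exactly these details, which is part of why no single method is canonical.

```python
def quantize_affine(values, num_bits=8):
    """Map floats to unsigned integers using a per-tensor scale and zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = max(qmin, min(qmax, round(qmin - lo / scale)))
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize_affine(q, scale, zero_point):
    """Recover approximate floats from the integer codes."""
    return [scale * (qi - zero_point) for qi in q]

values = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale, zero_point = quantize_affine(values)
recon = dequantize_affine(q, scale, zero_point)
for v, qi, r in zip(values, q, recon):
    print(f"{v:+.3f} -> {qi:3d} -> {r:+.5f}")
```

The reconstruction error stays within roughly one quantization step (`scale`), and 0.0 round-trips exactly because it maps to the zero point.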

tom · September 16, 2019, 5:49pm

If you look at the GitHub issues or PRs, or even the git tree’s test directory, you’ll find there is good progress towards a comprehensive solution covering the various quantisation strategies.

Best regards

Thomas

linsonghere · October 12, 2019, 11:11am

I tried converting my PyTorch model to an ONNX model and then converting that to a TensorRT model, but I hit an unexpected error (using yolov3.onnx downloaded from the official website). It said “ERROR: Network must have at least one output”. Have you ever run into this problem?