Difference between `torch.qint8` and `torch.int8`

Just curious: why do we need qint8 when int8 already exists? Is it because qint8 has a different, more efficient binary layout than int8? Thanks!

  1. int8 is a plain integer type; it can be used for any operation that needs integers.
  2. qint8 is a quantized tensor type that represents a compressed floating-point tensor; it has an underlying int8 data layer plus a scale, a zero_point, and a qscheme.
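A quick sketch of the difference (values chosen for illustration only): a qint8 tensor carries quantization metadata alongside its int8 storage, whereas an int8 tensor is just integers.

```python
import torch

# Plain int8: ordinary integers, usable in regular integer arithmetic.
t = torch.tensor([1, 2, 3], dtype=torch.int8)

# qint8: int8 storage plus quantization metadata (scale, zero_point).
x = torch.tensor([0.0, 0.5, 1.0])
q = torch.quantize_per_tensor(x, scale=0.01, zero_point=0, dtype=torch.qint8)

print(q.int_repr())                   # underlying int8 values: [0, 50, 100]
print(q.q_scale(), q.q_zero_point())  # the metadata: 0.01, 0
print(q.dequantize())                 # approximately recovers the original floats
```

Dequantization follows `(int_value - zero_point) * scale`, so the qint8 tensor behaves like a lossy compressed view of the float tensor, while the int8 tensor has no such interpretation attached.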

One could use torch.int8 as a building block for quantized int8 logic. That’s not how PyTorch does it today, but we do plan to converge toward this approach in the future.