Per Tensor/Channel quantization equivalents in PyTorch/Caffe2

The weights obtained from PyTorch per-tensor quantization of Conv2d can be used in Caffe2's Int8Conv. But from the Int8Conv definition, I understand that it accepts the scale only as a single float, not as an array.
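To illustrate the mismatch, here is a minimal PyTorch sketch contrasting the two schemes: a per-tensor quantized weight carries one float scale, while a per-channel quantized weight carries a tensor of scales (one per output channel), which is what Int8Conv's single-float scale field cannot represent. The shapes and quantization parameters below are made up for illustration.

```python
import torch

# A Conv2d-style weight: (out_channels, in_channels, kH, kW)
w = torch.randn(4, 3, 3, 3)

# Per-tensor: one float scale for the whole weight tensor.
qw_tensor = torch.quantize_per_tensor(w, scale=0.05, zero_point=0,
                                      dtype=torch.qint8)
print(qw_tensor.q_scale())  # a single Python float

# Per-channel: one scale per output channel (axis=0), stored as a tensor.
scales = torch.full((4,), 0.05)
zero_points = torch.zeros(4, dtype=torch.int64)
qw_channel = torch.quantize_per_channel(w, scales, zero_points,
                                        axis=0, dtype=torch.qint8)
print(qw_channel.q_per_channel_scales())  # a length-4 tensor of scales
```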

Is it possible to use the PerChannel quantization in Caffe2?

Unfortunately, Caffe2's Int8Conv doesn't support per-channel quantization. The DNNLOWP engine, which uses the FBGEMM backend, does support group-wise quantization if that helps you. Please see the example of using group-wise quantization.