Hello,

Why do I have to choose the activation quantization dtype for QAT to be quint8? Is there a way to get past that?

I used the qconfig that someone else posted here on the forum:

```python
import torch

activation_bitwidth = 8  # whatever bit width you want
bitwidth = 8  # whatever bit width you want (weights)

# Fake-quantize activations as signed symmetric ints
# (note: the activation observer uses activation_bitwidth, not bitwidth)
fq_activation = torch.quantization.FakeQuantize.with_args(
    observer=torch.quantization.MinMaxObserver.with_args(
        quant_min=-(2 ** activation_bitwidth) // 2,
        quant_max=(2 ** activation_bitwidth) // 2 - 1,
        dtype=torch.qint8,
        qscheme=torch.per_tensor_symmetric,
        reduce_range=False,
    )
)

# Fake-quantize weights the same way
fq_weights = torch.quantization.FakeQuantize.with_args(
    observer=torch.quantization.MinMaxObserver.with_args(
        quant_min=-(2 ** bitwidth) // 2,
        quant_max=(2 ** bitwidth) // 2 - 1,
        dtype=torch.qint8,
        qscheme=torch.per_tensor_symmetric,
        reduce_range=False,
    )
)

intB_qat_qconfig = torch.quantization.QConfig(activation=fq_activation, weight=fq_weights)
```
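For what it's worth, the `quant_min`/`quant_max` expressions in that qconfig reduce to the standard symmetric signed integer ranges. A quick sanity check (plain Python, no torch needed; `signed_range` is just a helper name for illustration):

```python
def signed_range(bitwidth):
    """Return (quant_min, quant_max) for a signed integer of the given width."""
    quant_min = -(2 ** bitwidth) // 2
    quant_max = (2 ** bitwidth) // 2 - 1
    return quant_min, quant_max

print(signed_range(8))  # (-128, 127), the full qint8 range
print(signed_range(4))  # (-8, 7), e.g. for 4-bit fake quantization
```

So with `bitwidth = 8` the observer covers the full qint8 range, which is why `reduce_range=False` is set.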