I am trying to manually set the weight values after QAT training on each conv layer to 4 unique values.
I printed the number of unique values after this operation using the below code:
for name,param in quantized_model.named_parameters():
print(name, len(np.unique(param.cpu().detach().numpy())))
if name == ‘model_fp32.layer1.0.conv1.weight’:
print(param, np.unique(param.cpu().detach().numpy()))
This gave 4 unique weight values per layer.
But, when I do torch.quantization.convert(quantized_model, inplace=True)
with qconfig as torch.quantization.QConfig(activation=torch.quantization.MinMaxObserver.with_args(qscheme=torch.per_tensor_affine, dtype=torch.quint8), weight=torch.quantization.MinMaxObserver.with_args(qscheme=torch.per_tensor_affine, dtype=torch.qint8))
I get 4 unique uint8 values per channel per layer although I chose qscheme as torch.per_tensor_affine.
Can someone please let me know what I am doing wrong?
The end goal is to get 4 unique weight values per layer in the int8 model as well.