Quantization of weights after QAT training

After QAT training, I am trying to manually set the weights of each conv layer to 4 unique values.
I printed the number of unique values after this operation with the following code:
import numpy as np

for name, param in quantized_model.named_parameters():
    print(name, len(np.unique(param.cpu().detach().numpy())))
    if name == 'model_fp32.layer1.0.conv1.weight':
        print(param, np.unique(param.cpu().detach().numpy()))

This gave 4 unique weight values per layer.
But when I then run torch.quantization.convert(quantized_model, inplace=True) with the following qconfig:

qconfig = torch.quantization.QConfig(
    activation=torch.quantization.MinMaxObserver.with_args(
        qscheme=torch.per_tensor_affine, dtype=torch.quint8),
    weight=torch.quantization.MinMaxObserver.with_args(
        qscheme=torch.per_tensor_affine, dtype=torch.qint8),
)

I get 4 unique int8 values per channel in each layer, even though I chose torch.per_tensor_affine as the qscheme.
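This is how I counted the values in the converted model (a sketch; quantized conv modules expose their weight through weight(), and int_repr() returns the stored integer tensor):

import torch

conv = quantized_model.model_fp32.layer1[0].conv1  # quantized conv after convert
w_int8 = conv.weight().int_repr()                  # the stored int8 weight values
for c in range(w_int8.shape[0]):                   # one output channel per row
    print(c, torch.unique(w_int8[c]))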
Can someone please let me know what I am doing wrong?

The end goal is to get 4 unique weight values per layer in the int8 model as well.

Hi Madhumitha,

Do you mind sharing the code where you manually set the weights? The call to convert swaps out some modules in the model, so it's possible this interfered with the weight values. Have you tried setting the weights manually after convert?
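Something like this untested sketch (module path taken from your post above; the 4 target values are made up):

import torch

# Overwrite the weights of a quantized conv *after* convert. Quantized
# modules store their weight as a quantized tensor, so the new values must
# be re-quantized with the module's existing scale and zero_point.
conv = quantized_model.model_fp32.layer1[0].conv1
old_w = conv.weight()                          # current quantized weight
levels = torch.tensor([-1.0, -0.5, 0.5, 1.0])  # example 4 target values
# Snap every dequantized weight to its nearest target level.
idx = (old_w.dequantize().unsqueeze(-1) - levels).abs().argmin(dim=-1)
new_w = torch.quantize_per_tensor(
    levels[idx],
    scale=old_w.q_scale(),
    zero_point=old_w.q_zero_point(),
    dtype=torch.qint8,
)
conv.set_weight_bias(new_w, conv.bias())

Note that q_scale() and q_zero_point() only make sense here because your weights are quantized per-tensor; a per-channel qscheme would need per-channel handling.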

Best,
-Andrew

Hi Andrew,
Thank you for the reply! I found the issue, though.
I had indeed set the weights to 4 unique values per layer. However, the batchnorm folding performed just before quantization scales each output channel's weights by that channel's batchnorm parameters, which is why I saw 4 unique values per channel.
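To illustrate with made-up numbers: folding multiplies channel c's weights by gamma[c] / sqrt(var[c] + eps), so each output channel ends up with its own scaled copy of the 4 levels:

import torch

w = torch.tensor([[-1.0, -0.5, 0.5, 1.0],   # channel 0: the 4 shared levels
                  [-1.0, -0.5, 0.5, 1.0]])  # channel 1: the same 4 levels
gamma = torch.tensor([2.0, 3.0])            # per-channel BN scale (made up)
var = torch.tensor([1.0, 4.0])              # per-channel BN running variance
eps = 1e-5

w_folded = w * (gamma / torch.sqrt(var + eps)).unsqueeze(1)
print(torch.unique(w_folded[0]))  # 4 values for channel 0
print(torch.unique(w_folded[1]))  # a different 4 values for channel 1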
Thanks for your input though!

Madhu