Hi, I have defined a neural network made of fully connected layers and quantized it with Post Training Static Quantization. I am using PyTorch 2.0.0+cu118.

Here is the network architecture and the quantization process:

```
import torch
import torch.nn as nn


class HPC(nn.Module):
    def __init__(self, input_features, out_features):
        super(HPC, self).__init__()
        self.linear1 = nn.Linear(input_features, 512)
        self.linear2 = nn.Linear(512, 256)
        self.linear3 = nn.Linear(256, 64)
        self.linear4 = nn.Linear(64, out_features)
        self.sigmoid = nn.Sigmoid()
        # Stubs marking where tensors enter and leave the quantized region.
        self.quant = torch.ao.quantization.QuantStub()
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, n_input):
        # Quantize the float input before the first linear layer.
        out = self.quant(n_input)
        out = self.linear1(out)
        out = self.linear2(out)
        out = self.linear3(out)
        out = self.sigmoid(self.linear4(out))
        # Convert the quantized output back to a float tensor.
        out = self.dequant(out)
        return out
```

```
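import copy

import torch

# Copy the trained model so the original float model stays untouched.
myModel = copy.deepcopy(model)
myModel.eval()
myModel.qconfig = torch.ao.quantization.default_qconfig
print(myModel.qconfig)
# Insert observers that record activation ranges.
torch.ao.quantization.prepare(myModel, inplace=True)
# Calibrate on 200 training samples, one sample per forward pass.
for i in range(200):
    X_train_arr_tensor = torch.Tensor(X_train_arr[i].flatten()).cuda()
    X_train_arr_tensors = torch.unsqueeze(X_train_arr_tensor, 0)
    myModel(X_train_arr_tensors)
# Replace the observed modules with their quantized counterparts.
torch.ao.quantization.convert(myModel, inplace=True)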
```

After completing the steps above, I printed the keys of the quantized model's state dict. The results are shown below.

```
for i in myModel.state_dict():
    print(i)
```

linear1.scale

linear1.zero_point

linear1._packed_params.dtype

linear1._packed_params._packed_params

linear2.scale

linear2.zero_point

linear2._packed_params.dtype

linear2._packed_params._packed_params

linear3.scale

linear3.zero_point

linear3._packed_params.dtype

linear3._packed_params._packed_params

linear4.scale

linear4.zero_point

linear4._packed_params.dtype

linear4._packed_params._packed_params

leaky.scale

leaky.zero_point

quant.scale

quant.zero_point

When I printed out myModel.linear1._packed_params, I found that the scale and zero_point values in _packed_params were not equal to myModel.linear1.scale and myModel.linear1.zero_point.

```
print(myModel.linear1._packed_params)
```

(tensor([[-0.0381, -0.0122, 0.0366, ..., -0.0059, 0.0067, -0.0362],

[ 0.0196, 0.0037, -0.0277, ..., 0.0074, -0.0362, 0.0126],

...,

[ 0.0126, 0.0403, -0.0104, ..., 0.0359, -0.0107, 0.0263]],

device='cuda:0', size=(512, 504), dtype=torch.qint8,

quantization_scheme=torch.per_tensor_affine, scale=0.00036982630263082683,

zero_point=0), Parameter containing:

tensor([-2.7267e-02, -4.1525e-02, 3.3713e-02, ..., -3.9943e-02], device='cuda:0', requires_grad=True))

```
print("this is scale: ",myModel.linear1.scale)
print("this is zero_point:",myModel.linear1.zero_point)
```

this is scale: 0.030707480385899544

this is zero_point: 63
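My current reading of the mismatch, which I would like confirmed: the scale and zero_point inside _packed_params describe the weight tensor, while linear1.scale and linear1.zero_point describe the layer's quantized output activation, so the two pairs are not expected to match. Below is a minimal sketch of per-tensor affine quantization using the numbers printed above. The helper names quantize and dequantize are my own, and the 0..127 output range is an assumption about what default_qconfig uses for activations.

```python
# Per-tensor affine quantization:
#   q = clamp(round(x / scale) + zero_point, qmin, qmax)
#   x ~ (q - zero_point) * scale
def quantize(x, scale, zero_point, qmin, qmax):
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale


# Weight parameters printed inside _packed_params (qint8, range -128..127).
W_SCALE, W_ZP = 0.00036982630263082683, 0
# Module-level parameters printed as linear1.scale / linear1.zero_point.
# Assumption: they apply to the output activation, with range 0..127.
OUT_SCALE, OUT_ZP = 0.030707480385899544, 63

# First weight entry from the printed tensor, mapped to its integer code and back.
w_q = quantize(-0.0381, W_SCALE, W_ZP, -128, 127)
w_back = dequantize(w_q, W_SCALE, W_ZP)

# A real value of 0.0 in the output maps to the output zero_point.
zero_q = quantize(0.0, OUT_SCALE, OUT_ZP, 0, 127)

print(w_q, w_back, zero_q)
```

If this reading is right, the weight values shown in the printed tensor are the dequantized view of int8 codes stored with the weight scale, and the much larger module-level scale only reflects the range of the layer's outputs seen during calibration.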

The state dict does not contain separate quantized weight and bias entries for the fully connected layers. How can I obtain the quantized weights and biases for each layer?
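To make the question concrete, here is a minimal sketch of the kind of access I am after. It assumes that the weight() and bias() accessors on the quantized Linear module are the intended way to unpack _packed_params, and that a CPU quantized backend is available. Tiny is a hypothetical stand-in for my model, and the random calibration data is only there to make the example self-contained.

```python
import torch
import torch.nn as nn


class Tiny(nn.Module):
    # Hypothetical one-layer stand-in for the HPC model above.
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()
        self.linear = nn.Linear(8, 4)
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.linear(self.quant(x)))


torch.manual_seed(0)
m = Tiny().eval()
m.qconfig = torch.ao.quantization.default_qconfig
torch.ao.quantization.prepare(m, inplace=True)
# Calibrate on random inputs (placeholder for real training samples).
for _ in range(10):
    m(torch.randn(1, 8))
torch.ao.quantization.convert(m, inplace=True)

# Unpack the quantized weight and the float bias from the packed parameters.
w_q = m.linear.weight()        # quantized tensor (dtype torch.qint8)
w_int8 = w_q.int_repr()        # raw int8 codes
w_scale = w_q.q_scale()        # weight scale
w_zp = w_q.q_zero_point()      # weight zero_point
bias = m.linear.bias()         # bias is kept in float

# Module-level scale / zero_point, which I assume describe the output activation.
out_scale, out_zp = m.linear.scale, m.linear.zero_point

print(w_int8.dtype, w_scale, w_zp, bias.dtype, out_scale, out_zp)
```

If this is the right approach, I would expect to loop over linear1 to linear4 in my model and read each layer's weight().int_repr(), weight scale, and bias() the same way.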