```
import torch
from torch import nn
from torch.quantization import quantize_dynamic

class UnifiedModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(15, 10)
        self.conv = nn.Conv1d(10, 10, 1)
        self.linear2 = nn.Linear(10, 5)

    def forward(self, x):
        # Conv1d expects (batch, channels, length), hence the transposes
        x = self.linear1(x).transpose(-1, -2)
        x = self.conv(x).transpose(-1, -2)
        return self.linear2(x)

model = UnifiedModel()

# Dynamically quantize the Linear (and any LSTM) layers to float16
quantized_model = quantize_dynamic(
    model, {nn.LSTM, nn.Linear}, dtype=torch.float16
)
```

Counting the parameters of both the non-quantized and the quantized model:

```
print("Number of parameters before:", sum(p.numel() for p in model.parameters()))
print("Number of parameters after:", sum(p.numel() for p in quantized_model.parameters()))
```

Results:

```
Number of parameters before: 325
Number of parameters after: 110
```
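The arithmetic checks out: the quantized count is exactly the parameter count of the conv layer, and the two Linear layers no longer contribute anything. A quick per-layer tally (recreating the three layers standalone) shows this:

```
from torch import nn

# Per-layer parameter counts for the layers used in UnifiedModel
layers = {
    "linear1": nn.Linear(15, 10),     # 15*10 weights + 10 biases = 160
    "conv":    nn.Conv1d(10, 10, 1),  # 10*10*1 weights + 10 biases = 110
    "linear2": nn.Linear(10, 5),      # 10*5 weights + 5 biases = 55
}
counts = {name: sum(p.numel() for p in m.parameters()) for name, m in layers.items()}
print(counts)                # {'linear1': 160, 'conv': 110, 'linear2': 55}
print(sum(counts.values()))  # 325 in total; the conv alone accounts for the 110
```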

My understanding was that dynamic quantization supports Linear and LSTM layers, so in this case the Linear layers should perform their operations in float16 — which is indeed the case:

```
print(quantized_model)
```

Result:

```
UnifiedModel(
  (linear1): DynamicQuantizedLinear(in_features=15, out_features=10, dtype=torch.float16)
  (conv): Conv1d(10, 10, kernel_size=(1,), stride=(1,))
  (linear2): DynamicQuantizedLinear(in_features=10, out_features=5, dtype=torch.float16)
)
```

But why does the number of parameters of the quantized model decrease in this case? Aren't we just lowering the precision?

Also, when printing the parameters:

Non-Quantized Model:

```
for name, param in model.named_parameters():
    print(name, param.shape)
```

Result:

```
linear1.weight torch.Size([10, 15])
linear1.bias torch.Size([10])
conv.weight torch.Size([10, 10, 1])
conv.bias torch.Size([10])
linear2.weight torch.Size([5, 10])
linear2.bias torch.Size([5])
```

For Quantized Model:

```
for name, param in quantized_model.named_parameters():
    print(name, param.shape)
```

Result:

```
conv.weight torch.Size([10, 10, 1])
conv.bias torch.Size([10])
```

In the quantized model there are no parameters at all for the Linear layers. Also, during inference, where does it store their weights?
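To see where the quantized weights actually live, one can inspect the `state_dict` — a minimal sketch using a single Linear layer (the exact `_packed_params` key names are a PyTorch implementation detail and may differ across versions):

```
import torch
from torch import nn
from torch.quantization import quantize_dynamic

model = nn.Sequential(nn.Linear(15, 10))
qmodel = quantize_dynamic(model, {nn.Linear}, dtype=torch.float16)

# The quantized Linear exposes nothing via named_parameters()...
print(list(qmodel.named_parameters()))  # []

# ...but its weight and bias are still saved, packed into a
# "_packed_params" object that shows up in the state_dict
for key in qmodel.state_dict():
    print(key)

# The quantized module unpacks the weight on demand via a method call
print(qmodel[0].weight().shape)  # torch.Size([10, 15])
```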