# Convert floating point 32 bit of input and pretrained weight to 8bit

• I am using an AlexNet model where 7 layers are binarized (input and weight) and the 1st layer is not binarized (input and weight are 32-bit floating point). I want only the 1st layer's input and weight to be converted to 8 bit before they are sent into the convolution function, without affecting the other layers.

• I am using pretrained weights here

Just to make it clear – when you say “convert to 8 bit”, are you using quantization or are you just casting the types down? Also, we don’t support quantization lower than 8 bits, so binarization of the layers might not be supported without custom hacks.
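For reference, the difference between the two can be sketched like this (a minimal illustration with made-up values, not from the original post):

```python
import torch

x = torch.tensor([0.5, -1.25, 2.0])

# Plain casting truncates toward zero and throws away the fractional part.
cast = x.to(torch.int8)  # tensor([0, -1, 2], dtype=torch.int8)

# Quantization stores int8 values together with a scale and zero point,
# so the original range can be approximately recovered.
q = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.qint8)
back = q.dequantize()  # close to x, up to the quantization step of 0.1
```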

Lastly, if you already have the weights, and you just need an 8-bit model, you can follow these steps:

1. Make sure your model is quantizable – all layers in your network must be stateful and unique, that is, no “implied” layers in the forward and no in-place computation
2. Prepare the model using `prepare` function
3. Calibrate the prepared model by running your data through it at least once
4. Convert your model to the quantized version.
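The four steps can be sketched end to end (a minimal example with made-up layer sizes, assuming the eager-mode API under `torch.quantization`):

```python
import torch
import torch.nn as nn

class Small(nn.Module):
    def __init__(self):
        super(Small, self).__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, 3)      # arbitrary sizes for illustration
        self.relu = nn.ReLU(inplace=False)  # stateful, unique, not in-place
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.relu(self.conv(self.quant(x))))

model = Small().eval()
# Step 1: the model above is already quantizable.
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
prepared = torch.quantization.prepare(model)       # step 2: insert observers
prepared(torch.randn(1, 3, 32, 32))                # step 3: calibrate
quantized = torch.quantization.convert(prepared)   # step 4: int8 model
```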

On the first point:

This model cannot be quantized:

```python
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        # In-place ReLU shared between two inputs: no per-call state to observe
        self.relu = nn.ReLU(inplace=True)

    def forward(self, a, b):
        ra = self.relu(a)  # the same module is reused ("implied" layer)
        rb = self.relu(b)
        return ra + rb     # functional `+` has no module to attach an observer to
```

To make the model quantizable, you need to make sure there are no in-place operations and that every operation can save its state:

```python
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.relu_a = nn.ReLU(inplace=False)
        self.relu_b = nn.ReLU(inplace=False)
        # Stateful replacement for the functional `+`
        self.F = nn.quantized.FloatFunctional()

    def forward(self, a, b):
        ra = self.relu_a(a)
        rb = self.relu_b(b)
        return self.F.add(ra, rb)
```

If you want the model to take FP32 input and return FP32 output, you will need to insert a `QuantStub`/`DeQuantStub` at the appropriate locations:

```python
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.quant_stub_a = torch.quantization.QuantStub()
        self.quant_stub_b = torch.quantization.QuantStub()
        self.relu_a = nn.ReLU(inplace=False)
        self.relu_b = nn.ReLU(inplace=False)
        self.F = nn.quantized.FloatFunctional()
        self.dequant_stub = torch.quantization.DeQuantStub()

    def forward(self, a, b):
        qa = self.quant_stub_a(a)
        qb = self.quant_stub_b(b)
        ra = self.relu_a(qa)
        rb = self.relu_b(qb)
        out = self.F.add(ra, rb)
        return self.dequant_stub(out)  # back to FP32
```

Similarly, if you would like to quantize only a single layer, you would need to place the quant/dequant stubs only around what you want to quantize. Please note that you would need to specify the quantization parameters appropriately:

```python
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.quant_stub_a = torch.quantization.QuantStub()
        self.relu_a = nn.ReLU(inplace=False)
        self.relu_b = nn.ReLU(inplace=False)
        self.dequant_stub = torch.quantization.DeQuantStub()

    def forward(self, a, b):
        qa = self.quant_stub_a(a)   # quantized branch
        ra = self.relu_a(qa)
        a = self.dequant_stub(ra)   # back to FP32
        rb = self.relu_b(b)         # this branch stays in FP32
        return a + rb
```

The model above will be partially quantized, and you would need to give the qconfig only to the quant stub and the relu you want quantized.
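One convenient way to express that (a sketch, assuming the partially-quantized model above): set a model-wide qconfig and then explicitly disable it on the branch that should stay in FP32 by setting its qconfig to `None`:

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.quant_stub_a = torch.quantization.QuantStub()
        self.relu_a = nn.ReLU(inplace=False)
        self.relu_b = nn.ReLU(inplace=False)
        self.dequant_stub = torch.quantization.DeQuantStub()

    def forward(self, a, b):
        qa = self.quant_stub_a(a)   # quantized branch
        ra = self.relu_a(qa)
        a = self.dequant_stub(ra)   # back to FP32
        rb = self.relu_b(b)         # this branch stays in FP32
        return a + rb

model = Model().eval()
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
model.relu_b.qconfig = None  # leave this branch in FP32

prepared = torch.quantization.prepare(model)
prepared(torch.randn(8), torch.randn(8))  # calibrate
quantized = torch.quantization.convert(prepared)
```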
