Convert 32-bit floating point input and pretrained weights to 8-bit

  • I am using an AlexNet model where 7 layers are binarized (both input and weights) and the 1st layer is not binarized (its input and weights are 32-bit floating point). I want only the 1st layer's input and weights to be converted to 8-bit before they are passed to the convolution function, without affecting the other layers.

  • I am using pretrained weights here.

Just to make it clear – when you say “convert to 8-bit”, are you using quantization or are you just casting the types down? Also, we don’t support quantization lower than 8 bits, so binarization of the layers might not be supported without custom hacks.
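To illustrate the difference – casting just truncates the values, while quantization keeps a scale and zero-point alongside the int8 values so the float values can be approximately recovered. A minimal sketch (the scale here is made up, in practice it comes from an observer):

```python
import torch

w = torch.randn(4, 4)  # a float32 weight tensor

# Plain casting: values are truncated to integers, no way back
w_cast = w.to(torch.int8)

# 8-bit quantization: int8 values plus scale/zero_point metadata,
# so the float values can be approximately recovered
w_q = torch.quantize_per_tensor(w, scale=0.1, zero_point=0, dtype=torch.qint8)
w_restored = w_q.dequantize()  # float32 again, up to quantization error
```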

Lastly, if you already have the weights, and you just need an 8-bit model, you can follow these steps:

  1. Make sure your model is quantizable – all layers in your network must be stateful and unique; that is, no “implied” operations in the forward pass and no in-place computation
  2. Prepare the model using the prepare function
  3. Calibrate the prepared model by running your data through it AT LEAST once
  4. Convert your model to the quantized version.

You can follow the PTQ tutorial here: https://pytorch.org/tutorials/advanced/static_quantization_tutorial.html
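Putting the four steps together for a toy model (a minimal eager-mode sketch; the model, qconfig, and calibration data are just placeholders for yours):

```python
import torch
import torch.nn as nn

class SmallNet(nn.Module):
    def __init__(self):
        super(SmallNet, self).__init__()
        # Step 1: stateful, unique layers, with quant/dequant stubs at the borders
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, 3)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.conv(self.quant(x)))

model = SmallNet().eval()
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')

# Step 2: prepare – inserts observers into the model
prepared = torch.quantization.prepare(model)

# Step 3: calibrate – run representative data through the prepared model
with torch.no_grad():
    prepared(torch.randn(1, 3, 16, 16))

# Step 4: convert to the actual int8 model
quantized = torch.quantization.convert(prepared)
```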

On the first point:

This model cannot be quantized:

class Model(nn.Module):
  def __init__(self):
    super(Model, self).__init__()
    self.relu = nn.ReLU(inplace=True)  # in-place op breaks quantization
  def forward(self, a, b):
    ra = self.relu(a)  # the same module is reused for two different tensors
    rb = self.relu(b)
    return ra + rb  # "implied" add with no module to attach an observer to

To make the model quantizable, you need to make sure there are no inplace operations, and every operation can save the state:

class Model(nn.Module):
  def __init__(self):
    super(Model, self).__init__()
    self.relu_a = nn.ReLU(inplace=False)  # one stateful module per use
    self.relu_b = nn.ReLU(inplace=False)
    self.F = nn.quantized.FloatFunctional()  # wraps the add so it can be observed
  def forward(self, a, b):
    ra = self.relu_a(a)
    rb = self.relu_b(b)
    return self.F.add(ra, rb)

If you want to have the model take FP input and return the FP output you will need to insert the QuantStub/DequantStub at the appropriate locations:

class Model(nn.Module):
  def __init__(self):
    super(Model, self).__init__()
    self.quant_stub_a = torch.quantization.QuantStub()
    self.quant_stub_b = torch.quantization.QuantStub()
    self.relu_a = nn.ReLU(inplace=False)
    self.relu_b = nn.ReLU(inplace=False)
    self.F = nn.quantized.FloatFunctional()
    self.dequant_stub = torch.quantization.DeQuantStub()
  def forward(self, a, b):
    qa = self.quant_stub_a(a)
    qb = self.quant_stub_b(b)
    ra = self.relu_a(qa)
    rb = self.relu_b(qb)
    return self.dequant_stub(self.F.add(ra, rb))

Similarly, if you would like to only quantize a single layer, you would need to place the quant/dequant only where you want to quantize. Please, note that you would need to specify the quantization parameters appropriately:

class Model(nn.Module):
  def __init__(self):
    super(Model, self).__init__()
    self.quant_stub_a = torch.quantization.QuantStub()
    self.relu_a = nn.ReLU(inplace=False)
    self.dequant_stub = torch.quantization.DeQuantStub()
    self.relu_b = nn.ReLU(inplace=False)
  def forward(self, a, b):
    qa = self.quant_stub_a(a)
    ra = self.relu_a(qa)       # quantized branch
    a = self.dequant_stub(ra)  # back to floating point
    rb = self.relu_b(b)        # this branch stays in floating point
    return a + rb

The model above will be only partially quantized, and you would need to give the qconfig to the quant_stub and the relu only.
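For instance, to quantize only the first conv of a network (as in your case) you would attach the qconfig just to that branch. A sketch, assuming eager-mode PTQ and a made-up two-layer model (note I also give the qconfig to the dequant stub so it gets swapped at convert time):

```python
import torch
import torch.nn as nn

class PartialNet(nn.Module):
    def __init__(self):
        super(PartialNet, self).__init__()
        self.quant_stub = torch.quantization.QuantStub()
        self.conv1 = nn.Conv2d(3, 8, 3)   # to be quantized
        self.dequant_stub = torch.quantization.DeQuantStub()
        self.conv2 = nn.Conv2d(8, 8, 3)   # stays in floating point

    def forward(self, x):
        x = self.quant_stub(x)
        x = self.conv1(x)
        x = self.dequant_stub(x)
        return self.conv2(x)

model = PartialNet().eval()
qconfig = torch.quantization.get_default_qconfig('fbgemm')
# Attach the qconfig only to the modules that should be quantized;
# conv2 gets no qconfig and is left untouched
model.quant_stub.qconfig = qconfig
model.conv1.qconfig = qconfig
model.dequant_stub.qconfig = qconfig

prepared = torch.quantization.prepare(model)
with torch.no_grad():
    prepared(torch.randn(1, 3, 16, 16))  # calibration pass
quantized = torch.quantization.convert(prepared)
```

After convert, conv1 is an int8 module while conv2 remains a regular nn.Conv2d.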