Hi Andrew,
As my model structures are a bit complicated, I need to use eager-mode quantization. What I meant by “making models quantizable” is including all the modules needed for QAT or PTQ in the model definition from the start, whether they end up being used or not.
For example, instead of defining my model as:
from torch import nn

class Model(nn.Module):
    def __init__(self, nc_in: int = 3):
        super().__init__()
        self.conv = nn.Conv2d(nc_in, 64, 3, 1, 1, bias=False)
        self.bnorm = nn.BatchNorm2d(64, affine=True)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # residual block: x + relu(bn(conv(x)))
        return x + self.relu(self.bnorm(self.conv(x)))
and then fusing it, adding QuantStub/DeQuantStub etc. later, or even maintaining a separate model just for QAT, I could simply define my model as
from torch import nn
from torch.ao.nn.intrinsic import ConvBnReLU2d
from torch.ao.nn.quantized import FloatFunctional
from torch.ao.quantization import QuantStub, DeQuantStub

class QuantizableModel(nn.Module):
    def __init__(self, nc_in: int = 3):
        super().__init__()
        self.quant = QuantStub()
        # pre-fused placeholder: behaves like conv -> bn -> relu in float mode
        self.conv_bnorm_relu = ConvBnReLU2d(
            nn.Conv2d(nc_in, 64, 3, 1, 1, bias=False),
            nn.BatchNorm2d(64, affine=True),
            nn.ReLU(inplace=True),
        )
        # functional add so the skip connection can be quantized later
        self.add = FloatFunctional()
        self.dequant = DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.add.add(x, self.conv_bnorm_relu(x))
        x = self.dequant(x)
        return x
and use it for training whether using QAT or not.
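To convince myself that the float behaviour stays the same when QAT is off, this is the kind of quick sanity check I had in mind (the weight copying is only illustrative, and it relies on ConvBnReLU2d acting like a plain Sequential before preparation):

import torch

model = Model()
qmodel = QuantizableModel()

# the intrinsic fused module is a Sequential-like container in float mode,
# so its children can be addressed by index
qmodel.conv_bnorm_relu[0].load_state_dict(model.conv.state_dict())
qmodel.conv_bnorm_relu[1].load_state_dict(model.bnorm.state_dict())

model.eval()
qmodel.eval()
x = torch.randn(2, 3, 32, 32)
with torch.no_grad():
    assert torch.allclose(model(x), qmodel(x), atol=1e-6)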
Thus, I would not have to deal with fusing modules or adding QuantStubs later, as everything is already defined specifically for this type of model. All that would be required is adding a qconfig and calling prepare_qat:
if use_qat:
    # QAT needs the QAT qconfig and a model in train mode
    model.train()
    model.qconfig = torch.ao.quantization.get_default_qat_qconfig('x86')
    torch.ao.quantization.prepare_qat(model, inplace=True)
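After QAT fine-tuning I assume the usual eager-mode conversion step would then apply unchanged, something like:

model.eval()
quantized_model = torch.ao.quantization.convert(model)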
And if QAT is not used, nothing should change compared to the previous model.
Well, that last point is my actual question: I was uncertain whether there are any pitfalls, i.e. whether torch.ao.quantization.fuse_modules_qat performs any other changes, or whether the pre-fused placeholders themselves might be an obstacle.
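For reference, this is roughly how I would compare the two paths myself (assuming fuse_modules_qat maps Conv2d + BatchNorm2d + ReLU to the intrinsic ConvBnReLU2d and replaces the remaining entries with nn.Identity):

import torch
from torch import nn
from torch.ao.nn.intrinsic import ConvBnReLU2d

fused = torch.ao.quantization.fuse_modules_qat(
    Model().train(),                 # fuse_modules_qat expects a train-mode model
    [['conv', 'bnorm', 'relu']],
)
# the first entry should become the fused intrinsic module, the others identities
assert isinstance(fused.conv, ConvBnReLU2d)
assert isinstance(fused.bnorm, nn.Identity)
assert isinstance(fused.relu, nn.Identity)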