Scalar operation observers

Scalar op observers don’t seem to record statistics.

import torch
import torch.nn as nn

# hard sigmoid implemented as relu6(x + 3) / 6; FloatFunctional wraps the
# scalar add/mul so that eager mode quantization can treat them as modules
class HardSigmoid(nn.Module):
    def __init__(self):
        super().__init__()
        self.act = nn.ReLU6()
        self.add = nn.quantized.FloatFunctional()
        self.mul = nn.quantized.FloatFunctional()
    
    def forward(self, input):
        output = self.add.add_scalar(input, 3)
        output = self.act(output)
        output = self.mul.mul_scalar(output, 1/6)
        return output

class Residual(nn.Module):
    def __init__(self, *layers):
        super().__init__()
        self.layers = nn.Sequential(*layers)
        self.add = nn.quantized.FloatFunctional()

    def forward(self, input):
        return self.add.add(input, self.layers(input))

model = Residual(HardSigmoid())
model.qconfig = torch.quantization.default_qconfig
torch.quantization.prepare(model, inplace=True)
model(torch.rand(4,3,2,1))

After running the calibration input, model.add.activation_post_process.min_val has a value, but model.layers[0].mul.activation_post_process.min_val and model.layers[0].add.activation_post_process.min_val are both None.
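(For anyone reproducing this, a quick way to dump the observer state after calibration is something along these lines; the exact attributes depend on the observer type and PyTorch version:)

# print the recorded range of every attached observer (sketch,
# assumes MinMaxObserver-style observers with min_val / max_val)
for name, module in model.named_modules():
    if hasattr(module, 'activation_post_process'):
        obs = module.activation_post_process
        print(name, obs.min_val, obs.max_val)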

add_scalar and mul_scalar actually do not need an observer: the quantization parameters of the output tensor can be derived directly from those of the input quantized tensor, since the other operand is a known constant.
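Roughly, for a quantized tensor where real_value = scale * (q - zero_point), the output parameters follow directly from the input's. The helper names below are just for illustration; the actual quantized kernels additionally handle negative scalars, rounding, and clamping:

def mul_scalar_qparams(in_scale, in_zero_point, c):
    # multiplying by a positive constant c only rescales the grid:
    # c * scale * (q - zp) == (c * scale) * (q - zp)
    assert c > 0  # sign handling omitted in this sketch
    return in_scale * c, in_zero_point

def add_scalar_qparams(in_scale, in_zero_point, c):
    # adding a constant c shifts the grid:
    # scale * (q - zp) + c == scale * (q - (zp - c / scale))
    return in_scale, in_zero_point - c / in_scale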

I’m glad there is no issue and it works as designed. Still, it seems a bit strange to me to use FloatFunctional when it doesn’t record any state. Why not simply overload the scalar operations at convert time?

Eager mode quantization is based on module-level replacement. The current flow doesn’t do function-level swapping, though it should be feasible to replace such calls with quantized kernels.
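A minimal sketch of what module-level replacement means (the mapping and function names below are made up for illustration, not the actual torch.quantization internals): convert() walks the children and swaps each observed module for its quantized counterpart, which is why FloatFunctional modules can be swapped for QFunctional, while a bare + or * written directly in forward() has no module to swap.

import torch.nn.quantized as nnq

# hypothetical, trimmed-down version of the swap performed at convert time
SWAP_MAPPING = {
    nnq.FloatFunctional: nnq.QFunctional,
    # ... entries for Conv2d, Linear, ReLU6, etc. omitted
}

def convert_sketch(module):
    for name, child in module.named_children():
        if type(child) in SWAP_MAPPING:
            # from_float reads the observer attached during prepare()
            setattr(module, name, SWAP_MAPPING[type(child)].from_float(child))
        else:
            convert_sketch(child)
    return module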