Scalar operation observers

Scalar op observers don’t seem to record statistics.

import torch
import torch.nn as nn

# hard sigmoid implemented as relu6(x + 3) / 6; FloatFunctional wraps the
# scalar add/mul so that eager mode quantization can treat them as modules
class HardSigmoid(nn.Module):
    def __init__(self):
        super().__init__()
        self.act = nn.ReLU6()
        self.add = nn.quantized.FloatFunctional()
        self.mul = nn.quantized.FloatFunctional()
    
    def forward(self, input):
        output = self.add.add_scalar(input, 3)
        output = self.act(output)
        output = self.mul.mul_scalar(output, 1/6)
        return output

class Residual(nn.Module):
    def __init__(self, *layers):
        super().__init__()
        self.layers = nn.Sequential(*layers)
        self.add = nn.quantized.FloatFunctional()

    def forward(self, input):
        return self.add.add(input, self.layers(input))

model = Residual(HardSigmoid())
model.qconfig = torch.quantization.default_qconfig
torch.quantization.prepare(model, inplace=True)
model(torch.rand(4,3,2,1))

After running the calibration input, model.add.activation_post_process.min_val has a value, but model.layers[0].mul.activation_post_process.min_val and model.layers[0].add.activation_post_process.min_val are both None.
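(For anyone reproducing this, a quick way to dump the observer state after calibration is something along these lines; the exact attributes depend on the observer type and PyTorch version:)

# print the recorded range of every attached observer (sketch,
# assumes MinMaxObserver-style observers with min_val / max_val)
for name, module in model.named_modules():
    if hasattr(module, 'activation_post_process'):
        obs = module.activation_post_process
        print(name, obs.min_val, obs.max_val)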

add_scalar and mul_scalar actually do not need an observer: the quantization parameters of the output tensor can be derived directly from those of the input quantized tensor, since the other operand is a known constant.
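Roughly, for a quantized tensor where real_value = scale * (q - zero_point), the output parameters follow directly from the input's. The helper names below are just for illustration; the actual quantized kernels additionally handle negative scalars, rounding, and clamping:

def mul_scalar_qparams(in_scale, in_zero_point, c):
    # multiplying by a positive constant c only rescales the grid:
    # c * scale * (q - zp) == (c * scale) * (q - zp)
    assert c > 0  # sign handling omitted in this sketch
    return in_scale * c, in_zero_point

def add_scalar_qparams(in_scale, in_zero_point, c):
    # adding a constant c shifts the grid:
    # scale * (q - zp) + c == scale * (q - (zp - c / scale))
    return in_scale, in_zero_point - c / in_scale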

I’m glad there is no issue and it works as designed. Still, it seems a bit strange to me to use FloatFunctional when it doesn’t record any state. Why not simply overload the scalar operations at convert time?

Eager mode quantization is based on module-level replacement. The current flow doesn’t do function-level swapping, though it should be feasible to replace such calls with quantized kernels.
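A minimal sketch of what module-level replacement means (the mapping and function names below are made up for illustration, not the actual torch.quantization internals): convert() walks the children and swaps each observed module for its quantized counterpart, which is why FloatFunctional modules can be swapped for QFunctional, while a bare + or * written directly in forward() has no module to swap.

import torch.nn.quantized as nnq

# hypothetical, trimmed-down version of the swap performed at convert time
SWAP_MAPPING = {
    nnq.FloatFunctional: nnq.QFunctional,
    # ... entries for Conv2d, Linear, ReLU6, etc. omitted
}

def convert_sketch(module):
    for name, child in module.named_children():
        if type(child) in SWAP_MAPPING:
            # from_float reads the observer attached during prepare()
            setattr(module, name, SWAP_MAPPING[type(child)].from_float(child))
        else:
            convert_sketch(child)
    return module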