Problem with static quantization

Hi!
I have a model containing a layer like this:

import torch
import torch.nn as nn

class SomelLayer(nn.Module):
    # some code here
    def forward(self, x):
        dim = 3
        mean_x = x.mean(dim)
        mean_x2 = (x * x).mean(dim)
        std_x = torch.nn.functional.relu(mean_x2 - mean_x * mean_x).sqrt()
    # some code

I want to apply static quantization, so I rewrote the layer with quantization stubs and `FloatFunctional` wrappers:

import torch
import torch.nn as nn
from torch.quantization import QuantStub, DeQuantStub

class SomelLayer(nn.Module):
    def __init__(self, mode):
        super(SomelLayer, self).__init__()
        self.quant = QuantStub()
        self.dequant = DeQuantStub()

        self.f_mul = torch.nn.quantized.FloatFunctional()
        self.f_add = torch.nn.quantized.FloatFunctional()
        self.f_add_relu = torch.nn.quantized.FloatFunctional()
        self.f_mul_scalar = torch.nn.quantized.FloatFunctional()

    def forward(self, x):
        dim = 3
        x = self.quant(x)
        mean_x = x.mean(dim)
        mean_x2 = self.f_mul.mul(x, x).mean(dim)
        std_x = self.f_add_relu.add_relu(
            mean_x2,
            self.f_mul_scalar.mul_scalar(self.f_mul.mul(mean_x, mean_x), -1.0))
        std_x = std_x.sqrt()
        # some code
        x = self.dequant(x)
        return x
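For reference, I prepare and convert the model with the usual eager-mode static quantization workflow. A minimal sketch below, using a toy model: `TinyModel`, its conv layer, and the random calibration input are placeholders, not my real model.

```python
import torch
import torch.nn as nn
from torch.quantization import QuantStub, DeQuantStub

class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()      # marks where fp32 -> int8 happens
        self.conv = nn.Conv2d(3, 3, 1)
        self.dequant = DeQuantStub()  # marks where int8 -> fp32 happens

    def forward(self, x):
        x = self.quant(x)
        x = self.conv(x)
        return self.dequant(x)

model = TinyModel().eval()
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")

# Insert observers, then run representative data through to calibrate.
prepared = torch.quantization.prepare(model)
with torch.no_grad():
    prepared(torch.randn(1, 3, 8, 8))

# Swap float modules for quantized ones using the observed statistics.
quantized = torch.quantization.convert(prepared)
out = quantized(torch.randn(1, 3, 8, 8))
```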

But I’m getting an error during inference:

Traceback of TorchScript, original code (most recent call last):
  File ".../model.py", line 235, in forward
        std_x = self.f_add_relu.add_relu(mean_x2,
                                         self.f_mul_scalar.mul_scalar(self.f_mul.mul(mean_x, mean_x), -1.0))
        std_x = std_x.sqrt()
                ~~~~~~~~~~ <--- HERE
RuntimeError: Could not run 'aten::empty.memory_format' with arguments from the 'QuantizedCPU' backend. 'aten::empty.memory_format' is only available for these backends: [CPU, CUDA, MkldnnCPU, SparseCPU, SparseCUDA, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].

Do I have to implement a quantized sqrt kernel myself to solve this, as described here: https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/quantized/README.md ?
I'm running into similar problems with other operators.

The solution I found: dequantize just before the unsupported op, run it in fp32, then re-quantize.

# in __init__:
self.quant0 = QuantStub()
self.dequant0 = DeQuantStub()

# in forward:
std_x = self.dequant0(std_x)
std_x = std_x.sqrt()
std_x = self.quant0(std_x)
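Putting it together, here is a simplified sketch of the layer with the workaround in place (names like `SomeLayer` are illustrative; I also give the second multiply its own `FloatFunctional`, since as I understand it each instance should observe a single call site):

```python
import torch
import torch.nn as nn
from torch.quantization import QuantStub, DeQuantStub

class SomeLayer(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()
        self.dequant = DeQuantStub()
        self.dequant0 = QuantStub(), DeQuantStub()  # placeholder; see below
        # dedicated stubs that wrap only the unsupported sqrt
        self.dequant0 = DeQuantStub()
        self.quant0 = QuantStub()
        # one FloatFunctional per quantized op site
        self.f_mul = torch.nn.quantized.FloatFunctional()
        self.f_mul2 = torch.nn.quantized.FloatFunctional()
        self.f_mul_scalar = torch.nn.quantized.FloatFunctional()
        self.f_add_relu = torch.nn.quantized.FloatFunctional()

    def forward(self, x):
        dim = 3
        x = self.quant(x)
        mean_x = x.mean(dim)
        mean_x2 = self.f_mul.mul(x, x).mean(dim)
        # std = sqrt(relu(E[x^2] - E[x]^2))
        std_x = self.f_add_relu.add_relu(
            mean_x2,
            self.f_mul_scalar.mul_scalar(self.f_mul2.mul(mean_x, mean_x), -1.0))
        std_x = self.dequant0(std_x)  # drop to fp32 for the unsupported op
        std_x = std_x.sqrt()
        std_x = self.quant0(std_x)    # back into the quantized domain
        return self.dequant(std_x)
```

Before prepare/convert the stubs are pass-throughs, so the module still runs as a normal fp32 model, which makes it easy to sanity-check numerically against the original formula.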

Hi @pizuzadan, yes, a quantized kernel for sqrt is not implemented at the moment; doing that op in fp32 is the recommended workaround.