Static Quantization of UNet

Surya_J · October 19, 2021, 8:45am

Hi,

I am trying to quantize a UNet model using the built-in static quantization functions.

PyTorch CPU version 1.9.1
Ubuntu 20.04 LTS (conda env)

The model itself is taken from here. I modified it as follows (showing only the quantization-related parts):

class UNet(nn.Module):
    def __init__(self, num_classes, quantize=False):        
        super(UNet, self).__init__()
        self.num_classes = num_classes
        
        """ QUANTIZED VERSION ADDITIONS """
        self.quantize = quantize
        self.quant = torch.quantization.QuantStub()
        self.dequant = torch.quantization.DeQuantStub()        

    def forward(self, X):
        # Outputs are dequantized
        if self.quantize == True:
            output_out = self.dequant(output_out)
        
        # pass through other layers
        
        # Outputs are dequantized
        if self.quantize == True:
            output_out = self.dequant(output_out)
        return output_out

The quantization function is as follows:

def quantizeUNet(model, device, dataLoader, use_fbgemm=False):
    model.to(device)    
    model.eval()    
    modules_to_fuse =  [['contracting_11.0', 'contracting_11.2'],
                        ['contracting_11.3', 'contracting_11.5'],
                        ['contracting_21.0', 'contracting_21.2'],
                        ['contracting_21.3', 'contracting_21.5'],
                        ['contracting_31.0', 'contracting_31.2'],
                        ['contracting_31.3', 'contracting_31.5'],
                        ['contracting_41.0', 'contracting_41.2'],
                        ['contracting_41.3', 'contracting_41.5'],
                        ['middle.0', 'middle.2'],
                        ['middle.3', 'middle.5'],
                        ['expansive_12.0', 'expansive_12.2'],
                        ['expansive_12.3', 'expansive_12.5'],
                        ['expansive_22.0', 'expansive_22.2'],
                        ['expansive_22.3', 'expansive_22.5'],
                        ['expansive_32.0', 'expansive_32.2'],
                        ['expansive_32.3', 'expansive_32.5'],
                        ['expansive_42.0', 'expansive_42.2'],
                        ['expansive_42.3', 'expansive_42.5']]
    #print(modules_to_fuse)
    model = torch.quantization.fuse_modules(model, modules_to_fuse)
    if use_fbgemm == True:
        model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
    else:
        model.qconfig = torch.quantization.default_qconfig
    torch.quantization.prepare(model, inplace=True)
    
    ## Calibrate Quantization parameters on input dataset
    print('Calibrating Quantization parameters on input dataset ...')
    model.eval()
    with torch.no_grad():
        for data, target in dataLoader:
            model(data)
    torch.quantization.convert(model, inplace=True)
    print('### Static Quantization complete ###')
    return model

During inference, the output tensor (shape [1, 10, 256, 256]) contains 0s only.

I expected the output to contain probabilities for each class (10 classes in total), but it is essentially a zero matrix. Is there something I’m missing? How do I statically quantize the model correctly?

Vasiliy_Kuznetsov · October 19, 2021, 12:27pm

Surya_J:

    def forward(self, X):
        # Outputs are dequantized
        if self.quantize == True:
            output_out = self.dequant(output_out)
        
        # pass through other layers
        
        # Outputs are dequantized
        if self.quantize == True:
            output_out = self.dequant(output_out)
        return output_out

It might be a typo, but it should be something like

# quantize inputs (currently your code doesn't have this in what you pasted)
# run quantized model
# dequantize outputs

Other than that, your code looks right, as long as your calibration dataset is representative of your inference dataset. Are you sure the expected output from your input is not close to a matrix of zeros? Do you get the same thing for other outputs?
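The quantize, run, dequantize order described above can be illustrated with a minimal sketch. `TinyNet` and `quantize_tiny` are hypothetical stand-ins, not the UNet from this thread, and picking the qconfig from `torch.backends.quantized.engine` is an assumption made so the example runs on whatever backend is active.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the UNet, only to show where the stubs go."""

    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)             # float -> quantized at the model entry
        x = self.relu(self.conv(x))   # runs as quantized ops after convert()
        return self.dequant(x)        # quantized -> float at the model exit


def quantize_tiny(calibration_batches):
    model = TinyNet().eval()
    model = torch.quantization.fuse_modules(model, [["conv", "relu"]])
    # Assumption: use the qconfig that matches the active quantized backend.
    backend = torch.backends.quantized.engine
    model.qconfig = torch.quantization.get_default_qconfig(backend)
    torch.quantization.prepare(model, inplace=True)
    with torch.no_grad():
        for batch in calibration_batches:  # calibration pass for the observers
            model(batch)
    torch.quantization.convert(model, inplace=True)
    return model


torch.manual_seed(0)
batches = [torch.rand(1, 3, 16, 16) for _ in range(4)]
qmodel = quantize_tiny(batches)
out = qmodel(batches[0])
print(out.dtype, out.shape)
```

If the `self.quant(x)` call is missing, the quantized layers receive a float tensor, so the stub placement is worth checking first.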

You could also try the PyTorch Numeric Suite Tutorial to see if you can bisect the difference to a specific layer.
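As a rough illustration of bisecting by layer, the sketch below uses plain forward hooks rather than the Numeric Suite API. The helpers `capture_outputs` and `sqnr_db` and the small `nn.Sequential` model are hypothetical, and the signal-to-quantization-noise ratio is just one possible comparison metric.

```python
import copy
import torch
import torch.nn as nn


def capture_outputs(model, x):
    """Run model on x and return {submodule_name: output as a float tensor}."""
    captured, handles = {}, []

    def make_hook(name):
        def hook(_module, _inputs, output):
            if isinstance(output, torch.Tensor):
                # Quantized outputs are converted back to float for comparison.
                captured[name] = output.dequantize() if output.is_quantized else output.detach()
        return hook

    for name, module in model.named_modules():
        if name:  # skip the root module
            handles.append(module.register_forward_hook(make_hook(name)))
    with torch.no_grad():
        model(x)
    for handle in handles:
        handle.remove()
    return captured


def sqnr_db(reference, approx):
    """Signal-to-quantization-noise ratio in dB (higher means closer)."""
    noise = (reference - approx).norm()
    return float(20 * torch.log10(reference.norm() / (noise + 1e-12)))


torch.manual_seed(0)
float_model = nn.Sequential(
    torch.quantization.QuantStub(),
    nn.Conv2d(3, 4, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(4, 2, 3, padding=1),
    torch.quantization.DeQuantStub(),
).eval()

quant_model = copy.deepcopy(float_model)
# Assumption: use the qconfig that matches the active quantized backend.
quant_model.qconfig = torch.quantization.get_default_qconfig(torch.backends.quantized.engine)
torch.quantization.prepare(quant_model, inplace=True)
x = torch.rand(1, 3, 16, 16)
with torch.no_grad():
    quant_model(x)  # calibrate on the same input, for the demo only
torch.quantization.convert(quant_model, inplace=True)

float_out = capture_outputs(float_model, x)
quant_out = capture_outputs(quant_model, x)
report = {name: sqnr_db(float_out[name], quant_out[name])
          for name in float_out if name in quant_out}
for name, db in report.items():
    print(f"layer {name}: SQNR {db:.1f} dB")
```

The first layer where the SQNR drops sharply (an all-zero quantized output gives roughly 0 dB) is a reasonable place to start looking.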

Surya_J · October 19, 2021, 3:20pm

Hi @Vasiliy_Kuznetsov,

Thanks for the reply. Yes, I pasted the wrong code here. My actual code is:

def forward(self, X):
    # Inputs are quantized
    if self.quantize == True:
        X = self.quant(X)

The output is zero for the entire test set (I’m using a subset of the CityScapes dataset). The un-quantized model gives floating point output and the predictions are good. So, I assume there’s something missing when I quantize the model. I’ll try debugging using the link you shared. Thanks for the help.

Surya_J · November 1, 2021, 7:09am

Update

It seems that ConvTranspose2d is not yet supported for quantization. Hence, you have to dequantize the output before passing it through each of the unsupported layers, which in my case is slower than the original float model. See the related forum post.
I guess it's better to look for models that contain only supported layers when doing static quantization.
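For readers who do need to keep one layer in float, the dequantize-around-a-layer pattern mentioned above might look like the sketch below. `FloatIsland` is a hypothetical wrapper name, the surrounding model is a toy, and whether ConvTranspose2d actually needs this treatment depends on the PyTorch version and qconfig in use.

```python
import torch
import torch.nn as nn


class FloatIsland(nn.Module):
    """Hypothetical wrapper: dequantize, run a float layer, then requantize."""

    def __init__(self, layer):
        super().__init__()
        self.dequant = torch.quantization.DeQuantStub()
        self.layer = layer
        self.layer.qconfig = None  # tells prepare()/convert() to leave it in float
        self.quant = torch.quantization.QuantStub()

    def forward(self, x):
        return self.quant(self.layer(self.dequant(x)))


torch.manual_seed(0)
model = nn.Sequential(
    torch.quantization.QuantStub(),
    nn.Conv2d(3, 4, 3, padding=1),
    FloatIsland(nn.ConvTranspose2d(4, 4, kernel_size=2, stride=2)),
    nn.Conv2d(4, 2, 3, padding=1),
    torch.quantization.DeQuantStub(),
).eval()

# Assumption: use the qconfig that matches the active quantized backend.
model.qconfig = torch.quantization.get_default_qconfig(torch.backends.quantized.engine)
torch.quantization.prepare(model, inplace=True)
with torch.no_grad():
    for _ in range(4):  # calibration on random data, for the demo only
        model(torch.rand(1, 3, 8, 8))
torch.quantization.convert(model, inplace=True)

out = model(torch.rand(1, 3, 8, 8))
print(type(model[2].layer).__name__, out.shape)
```

Each such island adds a dequantize and a quantize step, which is consistent with the slowdown reported above when many layers are wrapped this way.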

jerryzh168 · November 5, 2021, 10:43pm

I think ConvTranspose is supported; see quantization_mappings.py in the pytorch/pytorch repository on GitHub.

Surya_J · November 7, 2021, 12:13am

Hi @jerryzh168,

Thanks for reaching out.

Based on the forum post, I jumped to the conclusion that ConvTranspose2d is not supported yet. I will look into the model and check for the problem.