Greetings. I have tried two quantization approaches for the resnet50 that ships with PyTorch, with mixed results:
Dynamic quantization works, but it only applies to the single Linear layer in ResNet (the final fully connected classifier), so the resulting improvements in model size and inference latency are just a few percent.
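For reference, this is essentially what I ran for the dynamic case (a minimal sketch):

```python
import torch
import torchvision

model = torchvision.models.resnet50(pretrained=True).eval()

# Only nn.Linear is dynamically quantized; all of ResNet's conv layers are untouched.
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```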
Static quantization nominally succeeds (prepare and convert run without errors), but at runtime the converted model throws the exception described in Supported quantized tensor operations, which I presume is caused by the "+" operation used to implement the skip connections. It doesn't seem feasible to exclude those from quantization, as they repeat throughout the entire depth of the model. Am I correct in concluding, then, that the ResNet implementation that ships with PyTorch cannot be (correctly) statically quantized with the current API?
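And this is roughly the static quantization path that fails for me (a sketch; the calibration loop over real data is abbreviated to a single random batch):

```python
import torch
import torchvision

model = torchvision.models.resnet50(pretrained=True).eval()
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')

torch.quantization.prepare(model, inplace=True)
# Calibration: a real run would feed representative data here.
model(torch.randn(1, 3, 224, 224))
torch.quantization.convert(model, inplace=True)

# This forward pass raises the exception, presumably at the first
# skip connection ("out += identity" in torchvision's Bottleneck),
# since "+" on quantized tensors is not supported.
model(torch.randn(1, 3, 224, 224))
```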
I understand that quantization support is marked experimental; I'd just like to confirm whether the limitations I'm seeing are expected at this stage.