I’m currently running some customized conv2d layers of a quantized ResNet18 model. My goal is to express the convolution in terms of an explicit matrix multiplication.
For the first convolution layer, for example, I notice that the input tensor is of type torch.uint8 and the filters are of type torch.int8, yet the output of ops.quantized.conv2d(input_tensor, conv params) is of type torch.uint8.
Given that a convolution is essentially a sum of products of the inputs (torch.uint8) with the filters (torch.int8), I don’t understand how this computation is performed. I would expect the output to be a signed type with at least 16 bits. Why, then, is the output of type torch.uint8?
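For reference, here is my current understanding of how a quantized dot product might work, sketched in plain Python: the products are accumulated in a wide signed type, and the accumulator is then requantized (rescaled, offset by a zero point, and clamped) back to uint8. All scale and zero-point values below are made-up examples, not taken from the actual model.

```python
# Hypothetical sketch of one quantized output element: uint8 activations
# times int8 weights, accumulated in a wide signed type, then requantized.
def quantized_dot(x_q, w_q, x_zp, x_scale, w_scale, out_scale, out_zp):
    # Accumulate in a wide signed accumulator (int32 in real kernels);
    # Python ints are unbounded, which models this safely.
    acc = sum((xq - x_zp) * wq for xq, wq in zip(x_q, w_q))
    # Requantize: rescale the accumulator to the output's quantization
    # parameters, add the output zero point, and clamp to the uint8 range.
    real = acc * (x_scale * w_scale) / out_scale
    return max(0, min(255, round(real) + out_zp))

# Example with made-up quantization parameters:
x_q = [12, 200, 7]    # uint8 activations
w_q = [-3, 5, 120]    # int8 weights
out = quantized_dot(x_q, w_q, x_zp=128, x_scale=0.02,
                    w_scale=0.01, out_scale=0.05, out_zp=64)
print(out)  # -> 9, a value in the uint8 range [0, 255]
```

If this is roughly what happens internally, then the uint8 output would be the requantized result rather than the raw accumulator — but I’d appreciate confirmation.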