Hi guys, I created a custom simulated quantized Conv2d layer, replaced all of the convolutional layers in Ultralytics YOLOv8 with it, and trained the model successfully. How can I now replace my custom quantized layers with PyTorch's built-in quantized layers, and load the parameters (weight, bias, scale, zero point) from my custom layers into them, so I can take advantage of the speedup during inference?
That might be tricky, since you'd need to match the numerics as well. I'm wondering why you need to implement custom quantized layers if you want to use PyTorch's built-in layers in the end?
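If you do go down that route, one possible approach is to build `torch.ao.nn.quantized.Conv2d` modules yourself and load your trained parameters into them. Below is a minimal sketch, not a definitive recipe: the attribute names on the custom layer (`conv`, `w_scale`, `w_zero_point`, `out_scale`, `out_zero_point`) are assumptions about how your layer stores things, so rename them to match your implementation.

```python
import torch
import torch.ao.nn.quantized as nnq

# Quantized kernels need a backend: "fbgemm" on x86, "qnnpack" on ARM.
torch.backends.quantized.engine = "fbgemm"


def to_builtin_qconv(custom_conv):
    """Build a torch.ao.nn.quantized.Conv2d from a custom simulated-quantized conv.

    Hypothetical attributes assumed on the custom layer:
      .conv                        - the wrapped float nn.Conv2d (weight/bias/shape info)
      .w_scale, .w_zero_point      - weight quantization parameters
      .out_scale, .out_zero_point  - output activation quantization parameters
    """
    float_conv = custom_conv.conv
    qconv = nnq.Conv2d(
        float_conv.in_channels,
        float_conv.out_channels,
        float_conv.kernel_size,
        stride=float_conv.stride,
        padding=float_conv.padding,
        dilation=float_conv.dilation,
        groups=float_conv.groups,
        bias=float_conv.bias is not None,
    )

    # Quantize the trained float weight with the custom layer's parameters.
    # fbgemm expects qint8 weights (zero point 0 for symmetric schemes);
    # use torch.quantize_per_channel instead if your layer is per-channel.
    q_weight = torch.quantize_per_tensor(
        float_conv.weight.detach(),
        scale=float(custom_conv.w_scale),
        zero_point=int(custom_conv.w_zero_point),
        dtype=torch.qint8,
    )
    bias = float_conv.bias.detach() if float_conv.bias is not None else None
    qconv.set_weight_bias(q_weight, bias)

    # Output activation quantization parameters used by the quantized kernel.
    qconv.scale = float(custom_conv.out_scale)
    qconv.zero_point = int(custom_conv.out_zero_point)
    return qconv
```

You'd then swap each converted module into the YOLOv8 model (e.g. walk `model.named_modules()` and `setattr` on the parent module). Keep in mind that built-in quantized modules expect quantized tensor inputs, so you also need `QuantStub`/`DeQuantStub` (or explicit `torch.quantize_per_tensor` calls) at the float/quantized boundaries, with scale and zero point taken from your custom layers' activation statistics. Even then, the built-in kernels won't bit-match a simulated fake-quant implementation exactly, so compare outputs layer by layer before trusting the converted model.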