Question on quantize_per_channel and dequantize

jerryzh168 (Jerry Zhang) April 6, 2025, 12:15am 6

For CPU int4 weight-only quantization, you can check out this thread: Quantized LLM inference vs quantized matrix multiplication speed in CPU - #3 by jerryzh168
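
In case it helps to see what "per-channel quantize, then dequantize" means for int4 weights, here is a rough plain-Python sketch. It is not the PyTorch or torchao implementation. It assumes a symmetric scheme with one scale per output row, and it keeps each value as an ordinary Python int instead of packing two int4 values per byte. The function names `quantize_per_channel_int4` and `dequantize_per_channel` are made up for illustration.

```python
def quantize_per_channel_int4(weight):
    """Symmetric per-row quantization to the int4 range [-8, 7].

    `weight` is a list of rows (one row per output channel).
    Returns the integer rows and one float scale per row.
    """
    q_rows, scales = [], []
    for row in weight:
        max_abs = max(abs(v) for v in row)
        # Assumption: map the largest magnitude in the row to 7.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        # Round to the nearest integer and clamp to the int4 range.
        q_rows.append([max(-8, min(7, round(v / scale))) for v in row])
        scales.append(scale)
    return q_rows, scales


def dequantize_per_channel(q_rows, scales):
    """Recover approximate floats: each row is multiplied by its own scale."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]


weight = [[0.7, -0.3, 0.1], [14.0, 2.0, -6.0]]
q, scales = quantize_per_channel_int4(weight)
restored = dequantize_per_channel(q, scales)
```

Because each row gets its own scale, a row with large values (the second one here) does not force a coarse step size on a row with small values. In this scheme the round-trip error per element is at most half of that row's scale.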