Hello,
I am trying to write a custom quantized operator. This is my first dive into the C++ code base, and I am finding it difficult to understand how cpu_kernel_vec from Loops.h works.
Would you mind helping with this example?
AT_DISPATCH_QINT_TYPES(out.scalar_type(), "qmul", [&]() {
  using Vec = Vec256<scalar_t>;
  cpu_kernel_vec(
      iter,
      // Scalar function: processes one pair of elements.
      [&](scalar_t a, scalar_t b) -> scalar_t {
        int32_t a_sub_z = static_cast<int32_t>(a.val_) -
            static_cast<int32_t>(self_zero_point);
        int32_t b_sub_z = static_cast<int32_t>(b.val_) -
            static_cast<int32_t>(other_zero_point);
        int32_t c = a_sub_z * b_sub_z;
        scalar_t res = at::native::requantize_from_int<scalar_t>(
            multiplier, zero_point, c);
        if (ReLUFused) {
          res.val_ = std::max<scalar_t::underlying>(res.val_, zero_point);
        }
        return res;
      },
      // Vector function: processes one Vec of elements at a time.
      [&](Vec a, Vec b) -> Vec {
        Vec::int_vec_return_type a_sub_zp =
            a.widening_subtract(Vec(static_cast<scalar_t>(self_zero_point)));
        Vec::int_vec_return_type b_sub_zp =
            b.widening_subtract(Vec(static_cast<scalar_t>(other_zero_point)));
        Vec::int_vec_return_type c;
        for (int i = 0; i < Vec::int_num_vecs(); ++i) {
          c[i] = a_sub_zp[i] * b_sub_zp[i];
        }
        Vec rv = Vec::requantize_from_int(c, multiplier, zero_point);
        if (ReLUFused) {
          rv = rv.maximum(Vec(static_cast<scalar_t>(zero_point)));
        }
        return rv;
      });
});
}
So here I can see that cpu_kernel_vec takes:
iter, the data to process,
a function that processes scalar values,
a function that processes vectors.
I have a hard time understanding when the first function is called. Does it override the Vec multiplication operator? If so, how does it know which operator to override?
Confused
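
To frame the question, here is a minimal sketch of the general pattern that a "scalar function plus vector function" API usually follows. It is not the actual Loops.h code. It assumes the vector function is applied to full fixed-size chunks of contiguous data and the scalar function handles the leftover tail, with no operator overriding involved. ToyVec, toy_kernel_vec, CallCounts, and demo_multiply are hypothetical names made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for Vec256<T>: a fixed-width pack of 4 ints.
struct ToyVec {
  static constexpr std::size_t kSize = 4;
  std::array<int32_t, kSize> lanes;

  static ToyVec load(const int32_t* p) {
    ToyVec v;
    for (std::size_t i = 0; i < kSize; ++i) v.lanes[i] = p[i];
    return v;
  }
  void store(int32_t* p) const {
    for (std::size_t i = 0; i < kSize; ++i) p[i] = lanes[i];
  }
};

// Records how often each callback ran, to make the dispatch visible.
struct CallCounts {
  std::size_t scalar_calls = 0;
  std::size_t vector_calls = 0;
};

// Hypothetical simplified loop (an assumption, not PyTorch's implementation):
// full chunks go to the vector function, the remaining tail to the scalar one.
template <typename ScalarOp, typename VecOp>
CallCounts toy_kernel_vec(const std::vector<int32_t>& a,
                          const std::vector<int32_t>& b,
                          std::vector<int32_t>& out,
                          ScalarOp op, VecOp vop) {
  const std::size_t n = out.size();
  assert(a.size() >= n && b.size() >= n);  // inputs must cover the output
  CallCounts counts;
  std::size_t i = 0;
  // Vector path: one call per full chunk of kSize elements.
  for (; i + ToyVec::kSize <= n; i += ToyVec::kSize) {
    vop(ToyVec::load(&a[i]), ToyVec::load(&b[i])).store(&out[i]);
    ++counts.vector_calls;
  }
  // Scalar path: one call per element that did not fit into a full chunk.
  for (; i < n; ++i) {
    out[i] = op(a[i], b[i]);
    ++counts.scalar_calls;
  }
  return counts;
}

// Multiplies 10 elements: two full chunks of 4, then a tail of 2.
CallCounts demo_multiply(std::vector<int32_t>& out) {
  std::vector<int32_t> a(10), b(10);
  for (int i = 0; i < 10; ++i) { a[i] = i; b[i] = 2; }
  out.assign(10, 0);
  return toy_kernel_vec(
      a, b, out,
      [](int32_t x, int32_t y) { return x * y; },
      [](ToyVec x, ToyVec y) {
        ToyVec r;
        for (std::size_t k = 0; k < ToyVec::kSize; ++k) {
          r.lanes[k] = x.lanes[k] * y.lanes[k];
        }
        return r;
      });
}
```

In this sketch both functions must compute the same result for the same inputs, since which one runs for a given element depends only on where that element falls in the buffer.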