Using quantized operators: Embedding bag C++ API

Abhiraj_Kanse · September 10, 2024, 6:27am

Hi,
I am trying to call the C++ API of the quantized embedding bag operator [embedding bag 4-bit rowwise offsets], but I am running into issues.

When I try to call it through the at::native namespace, compilation fails with the error below:

error: ‘embedding_bag_4bit_rowwise_offsets’ is not a member of ‘at::native’
30 | torch::Tensor emb = at::native::embedding_bag_4bit_rowwise_offsets(weight,input,offsets,false, 0, false, nullptr, nullptr, false);

I see that the README.md for ATen native quantization says at the end: “You should not need to use the registered kernels in C++. Although officially not supported, you can use the following”.
Does this mean that we cannot call the C++ API of the quantized embedding bag?

Anyway, I tried the method mentioned there, and it still fails to compile:

error: ‘const class c10::OperatorHandle’ has no member named ‘call’
7 | return op.call<torch::Tensor, torch::Tensor, torch::Tensor, torch::Tensor, bool, int, bool, torch::Tensor, torch::Tensor, bool>(weight, indices, offsets, scale_grad_by_freq, mode, pruned_weights, per_sample_weights, compressed_indices_mapping, include_last_offset);

I am basically confused about whether it is possible to use these quantized operators in C++ and, if so, how.
(Using them in Python seems pretty easy: “torch.ops.quantized.<operator_name>”.)