How can I find unique values in a tensor quickly in LibTorch (C++)?

I want to find the unique values of a tensor in LibTorch, and I am currently using unique_dim. However, finding the 20 unique elements in a vector of 100 million elements takes half a minute. In PyTorch, the same operation takes a fraction of a second with torch.unique. How can I get a comparably fast unique in LibTorch C++?

(I'm only interested in the CPU version.)