# How to achieve efficient greedy Hamiltonian path construction?

I have N C-dimensional points to form a tensor `Data` with a shape of `(C, N)`. Given a starting point with its index in `Data` (e.g. 0), I want to implement an efficient greedy Hamiltonian path construction algorithm as follows:

``````1. Initialize a list S = [], and the remaining set R = {all N data points};
2. Pick the starting point from R to S;
3. Using a loop to pick points one-by-one from R to S, in each round,
find the nearest point in R from the last point in S (measured by L2
distance), move it out from R and append to S;
4. When R is empty, return S and end.
``````

where each point is represented by its corresponding index in `Data` (i.e. a number in {0,1, …, N-1}), and the returned S is actually a permutation which can be applied on `Data` to get the ordered result. It’s relatively easy to implement the above algorithm by for-loops, but I am pursuing a more efficient way based on PyTorch. The trivial implementation could be with O(N^2) time complexity, and I’m wondering if this can be achieved with O(N*log(N)) time complexity and O(N) memory. Does there exists an API or a combination of fast operations to achieve this? (When the point dimensionality equals to 1, the result can be directly obtained by `torch.Tensor.sort().indices`, if I choose the largest value as the starting point)