Why does torch::Tensor nonzero take so long?

I am currently using torch::Tensor nonzero to implement the SuperPoint feature extraction algorithm. The nonzero function consumes about 30% of the time; a 640x360 image takes about 11 ms. What is the reason?

Are you running on CPU or GPU?
The main problem with this function is that it does not know in advance how many non-zero elements there are and so it is tricky to allocate the output properly in advance.
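
To illustrate the allocation problem, here is a minimal CPU-only sketch in plain C++ of the count-then-fill pattern. It is not the libtorch implementation, and `nonzero_indices` is a hypothetical helper name. The point is only that the output size is known after a full pass over the input. On a GPU, that count typically has to come back to the host before the output can be allocated, which forces a synchronization.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of libtorch): returns the flat indices of
// all non-zero entries in `data` using a count-then-fill pattern.
std::vector<std::int64_t> nonzero_indices(const std::vector<float>& data) {
    // Pass 1: count the non-zero elements. The output size is unknown
    // until the whole input has been inspected.
    std::size_t count = 0;
    for (float v : data) {
        if (v != 0.0f) {
            ++count;
        }
    }

    // Only now can the output be allocated with the exact size.
    std::vector<std::int64_t> indices(count);

    // Pass 2: write the index of every non-zero element.
    std::size_t out = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] != 0.0f) {
            indices[out++] = static_cast<std::int64_t>(i);
        }
    }
    return indices;
}
```

For example, calling `nonzero_indices` on `{0, 1.5, 0, -2, 3}` walks the input twice and returns the indices 1, 3 and 4.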

How can I reduce the processing time of torch::Tensor nonzero on the GPU?

Can you give more information about how you use it and what the general goal of using it is? Maybe you can do without it?