Fast operation for separating values of a tensor into two tensors

I’m looking for a fast operation that can separate the positive and negative values of a tensor X into two tensors of the same size.
One way to do this is using maximum and minimum like below:

zero = torch.zeros_like(X)  # torch.minimum/maximum expect a tensor as the second argument
neg = torch.minimum(X, zero)
pos = torch.maximum(X, zero)

In the above solution, each element of X gets compared to 0 twice. But the comparison only needs to happen once: if you know X < 0 is true, then you already know X > 0 is false.
So you could cut the number of comparisons in half.
If we call that operation sieve, then it would look like this:
pos, neg = torch.sieve(X, condition=(X > 0))

Is there an operation in torch that can achieve that?
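
To pin down the semantics being asked for, here is a minimal reference sketch in plain PyTorch. The name `sieve` is the hypothetical one proposed above, not an existing PyTorch function, and this version is not expected to be faster than min/max: it evaluates the condition once but still reads X twice through torch.where.

```python
import torch


def sieve(x: torch.Tensor, condition: torch.Tensor):
    """Hypothetical helper (not a PyTorch API): split x by a boolean mask.

    Returns (kept, rest), both shaped like x. `kept` holds x where
    condition is True and 0 elsewhere; `rest` holds x where condition
    is False and 0 elsewhere.
    """
    zeros = torch.zeros_like(x)
    kept = torch.where(condition, x, zeros)
    rest = torch.where(condition, zeros, x)
    return kept, rest


X = torch.tensor([[-1.5, 2.0], [0.0, -3.0]])
pos, neg = sieve(X, X > 0)  # the comparison X > 0 is evaluated only once
print(pos)
print(neg)
```

By construction, pos + neg reproduces X, which is a convenient sanity check for any faster implementation.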

Would torch.where work for you? See the torch.where page in the PyTorch 1.10.0 documentation.

However, I’m not familiar with the implementation details/potential performance tradeoffs, so you might want to check the final speed of whatever approach you go with (e.g., just because one approach only does a single read of data/comparison doesn’t mean it is guaranteed to be faster).
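
One way to check the speed is torch.utils.benchmark. The sketch below is only an illustration: the tensor shape is an assumed stand-in for a feature map, and the three helper functions are hypothetical names for the candidate approaches. Results will depend on hardware, dtype, and tensor size.

```python
import torch
from torch.utils import benchmark

# Assumed feature-map shape, chosen only for illustration.
X = torch.randn(16, 64, 32, 32)
zero = torch.zeros_like(X)


def split_minmax(x, zero):
    # Two element-wise passes, each comparing against zero.
    return torch.maximum(x, zero), torch.minimum(x, zero)


def split_clamp(x):
    # Same result, using clamp with a scalar bound.
    return torch.clamp(x, min=0), torch.clamp(x, max=0)


def split_where(x, zero):
    # One comparison to build the mask, then two selects.
    mask = x > 0
    return torch.where(mask, x, zero), torch.where(mask, zero, x)


candidates = {
    "minmax": "split_minmax(X, zero)",
    "clamp": "split_clamp(X)",
    "where": "split_where(X, zero)",
}
env = {
    "X": X,
    "zero": zero,
    "split_minmax": split_minmax,
    "split_clamp": split_clamp,
    "split_where": split_where,
}

timings = {}
for name, stmt in candidates.items():
    timer = benchmark.Timer(stmt=stmt, globals=env)
    timings[name] = timer.timeit(20).median  # seconds per call

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

On a GPU, the same Timer can be used with X moved to the device; torch.utils.benchmark handles CUDA synchronization for you.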

torch.where kinda does the inverse of what I’m looking for.
My function takes one tensor as input and returns two tensors. torch.where does the opposite: it takes two tensors and merges them into one.

Good catch. In that case, I’m not sure there is a clear alternative other than computing a mask and then reusing it for indexing (which would require more reads than the min/max approach anyway).
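
For reference, the mask-and-index approach could look roughly like the sketch below (the example values are made up). It does a single comparison, but the boolean indexing gathers and scatters elements separately for each output, which is where the extra reads come from.

```python
import torch

X = torch.tensor([-2.0, 0.5, 0.0, 3.0, -1.0])

mask = X > 0                 # single comparison pass over X
pos = torch.zeros_like(X)
neg = torch.zeros_like(X)
pos[mask] = X[mask]          # gather the positive entries, then scatter them
neg[~mask] = X[~mask]        # invert the mask and repeat for the rest

print(pos)
print(neg)
```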

Is this a performance bottleneck in your application?

Yes, it is an operation that is performed on feature maps of each layer. Like an activation function.
A faster version of it would make my runs 10-15% faster overall.

Any idea where to start if I were to write custom CPU and kernel code for this?

Created a feature request for it here: A `sieve` operation for separating values of a tensor into two tensors based on a condition. · Issue #67745 · pytorch/pytorch · GitHub