# Filter Out Undesired Rows

I am generating artificial data. I would like to filter out the rows of my input tensor that don’t satisfy a certain condition, and save the indices of those rows so that I can remove the corresponding rows from my output tensor.

Would it be possible to use your condition to index the tensor?
Here is a small example:

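```
import torch

x = torch.randn(10, 2)
condition = x > 0.            # elementwise boolean mask, shape (10, 2)
row_cond = condition.all(1)   # True where every entry in the row is positive
x[row_cond, :]                # keep only those rows
```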
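Since the question also asks about saving the indices, here is a minimal sketch of one way to turn the row mask into integer indices and reuse them on a second tensor. The tensor `y` and its shape are assumptions made up for illustration; only `x` and `row_cond` come from the example above.

```python
import torch

torch.manual_seed(0)        # only so the sketch is reproducible
x = torch.randn(10, 2)
y = torch.randn(10, 3)      # hypothetical second tensor with matching rows

row_cond = (x > 0.).all(1)                       # boolean mask, shape (10,)
kept_idx = torch.nonzero(row_cond).squeeze(1)    # integer indices of kept rows

# Indexing with the saved indices selects the same rows in both tensors
x_filt = x[kept_idx]
y_filt = y[kept_idx]
```

Indexing with `row_cond` directly gives the same result; the integer indices are mainly useful if you want to store or inspect which rows survived.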

I have a function that computes a value using all the inputs in a row, and I would like to filter out the rows that don’t meet a certain threshold for that value.

Let me know if you want me to be more specific.

Could you use this calculated value to index your tensor?
Let’s say your function computes the sum of each row:

```
value = x.sum(1)
x[value > threshold]
```

If that doesn’t work, could you post the shape of your tensor and the value you’ve calculated?
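If the per-row value cannot be written as a single vectorized call like `x.sum(1)`, a rough sketch of the same idea is to apply the function row by row and stack the results. The function `row_score` and the threshold below are placeholders invented for illustration, not the poster's actual computation.

```python
import torch

def row_score(row: torch.Tensor) -> torch.Tensor:
    # Hypothetical per-row function; stands in for the real computation
    return row.abs().mean()

x = torch.randn(10, 6)
threshold = 0.5   # arbitrary value for illustration

# One scalar per row, stacked into a 1-D tensor of shape (10,)
value = torch.stack([row_score(row) for row in x])

mask = value > threshold   # boolean row mask
x_filt = x[mask]           # rows whose score exceeds the threshold
```

The Python loop is slower than a vectorized expression, so it is only worth using when the function really has to see one row at a time.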

A small example of what I am doing, with 5 training examples:

Input:
tensor([[ 0.0166, -0.2023, -0.2503, -0.3227, -0.2823, 0.8440],
[ 0.4075, 0.0052, -0.7873, -0.3248, 0.1329, 0.3014],
[ 0.2826, 0.4441, 0.2709, 0.4514, 0.2911, 0.6008],
[-0.1225, 0.0034, -0.2977, 0.3847, 0.5563, 0.6625],
[-0.3808, -0.5172, 0.4302, -0.2792, 0.1753, 0.5419]])
Output:
tensor([[ 0.2294, -0.2380, 0.2742, -0.0511, 0.4272, 0.2381, -0.1149, -0.8085,
0.2283, -0.8853, 0.1314, 0.0665, -0.2199, 0.8177, 0.0667, 0.4147],
[ 0.4232, -0.5899, -0.3844, 0.9617, -0.9795, -0.0679, -0.0792, 0.7093,
-0.0951, 0.2633, -0.0480, -0.5599, -0.5668, -0.4858, -0.9084, -0.6490],
[ 0.2353, 0.6581, 0.0493, -0.4584, 0.4395, -0.3839, -0.2215, -0.5482,
-0.3140, -0.9266, 0.4267, 0.3888, 0.1986, 0.4910, 0.4238, 0.0442],
[ 0.1059, 0.0764, 0.5336, 0.6717, 0.7181, 0.5796, -0.2438, -0.0445,
-0.2032, 0.5817, 0.1111, 0.9255, 0.5072, -0.8547, 0.2925, 0.9609],
[ 0.8882, -0.0157, 0.3318, -0.9381, -0.3188, 0.4876, -0.9110, 0.8712,
-0.6576, 0.3162, -0.0379, 0.1762, 0.0969, -0.9348, -0.2148, -0.6321]])

I have a function theta that takes a row of my input tensor and outputs a value.

If I apply theta to each row of my input, I get:
tensor([0.2515, 0.2275, 0.2988, 0.2581, 0.2819])

Now, if I select 0.26 as my threshold value, I would like to filter out rows 0, 1 and 3 from both my input and output tensors and be left with just rows 2 and 4.

Thanks for the example.
Let’s call the first tensor `input`, the second `output` and the last `theta`.
Would the following work:

```
threshold = 0.26
idx = theta > threshold    # boolean mask with one entry per row
input_filt = input[idx]    # the same mask selects matching rows in both tensors
output_filt = output[idx]
```

If you want to keep the rows that do not satisfy the condition instead, you could negate the mask:

```
input_filt = input[~idx]
output_filt = output[~idx]
```

or just flip the comparison (the exact negation of `>` is `<=`).

Let me know if that’s what you want.
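For reference, here is a small self-contained sketch using the `theta` values posted above. The tensors `inp` and `out` are stand-ins with the same shapes as the posted input and output (row `i` is filled with the value `i`), so it is easy to see which rows survive; they are not the original data.

```python
import torch

theta = torch.tensor([0.2515, 0.2275, 0.2988, 0.2581, 0.2819])

# Stand-in tensors with the thread's shapes; row i is filled with the value i
inp = torch.arange(5.).unsqueeze(1).expand(5, 6)
out = torch.arange(5.).unsqueeze(1).expand(5, 16)

threshold = 0.26
idx = theta > threshold                      # boolean mask over rows
kept_rows = torch.nonzero(idx).squeeze(1)    # integer indices of kept rows

inp_filt = inp[idx]
out_filt = out[idx]

print(kept_rows.tolist())   # [2, 4]
```

Only rows 2 and 4 have a `theta` value above 0.26, which matches the result asked for in the question.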

@ptrblck - This is a very elegant solution. Thank you. Is there a way to apply this in the case where the mask should remove values that are equal to the previous value?

For example, suppose a tensor has:
tensor([
[0, 2.4],
[0, -4.1],
[1, 0.5],
[0, 1.6]
])
The mask should keep only the last two rows, since their first-column values differ from the first-column value of the row before them:
tensor([
[1, 0.5],
[0, 1.6]
])

How would you suggest applying a mask in this case?

Figured it out. Pretty simple actually.

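```
# Compare each row's first column with the previous row's first column
condition = x[1:, 0:1] != x[:-1, 0:1]
row_cond = condition.all(1)
y = x[1:, :]          # the first row has no predecessor, so it is dropped
y = y[row_cond, :]
```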

Cheers!
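For anyone landing here later, a minimal sketch of the same trick on the example tensor from the question. The variant that also keeps the first row is an addition for illustration and was not part of the question; it treats the first row as "changed" because it has no predecessor.

```python
import torch

x = torch.tensor([[0., 2.4],
                  [0., -4.1],
                  [1., 0.5],
                  [0., 1.6]])

# True where the first column differs from the previous row's first column
changed = x[1:, 0] != x[:-1, 0]       # shape (3,)

# As in the solution above: the first row is dropped
y_without_first = x[1:][changed]

# Variant (an assumption, not asked for in the thread): also keep the first row
first = torch.ones(1, dtype=torch.bool)
keep = torch.cat([first, changed])    # shape (4,)
y_with_first = x[keep]
```

Here `y_without_first` contains the rows `[1, 0.5]` and `[0, 1.6]`, as in the question, while `y_with_first` additionally keeps `[0, 2.4]`.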