How to slice/select from a PyTorch tensor using a Boolean mask?

Let’s say I have the indices which I obtained using
locs = Train_y == 0

in numpy I can just do
Train_negx = Train_x[np.invert(locs)]

but in PyTorch it is not working. What is the equivalent method? Note that I’m working on an output Variable.
I guess the problem is in the inverting step.

Train_y == 0 doesn’t return the indices of the matching elements; it returns a tensor of zeros and ones indicating which elements match the condition.
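To make the distinction concrete, here is a small sketch (with made-up values) showing the difference between a boolean mask and actual indices:

```python
import torch

Train_y = torch.tensor([0., 1., 0.])    # hypothetical labels
mask = Train_y == 0                     # boolean mask: tensor([True, False, True])
idx = torch.nonzero(mask).squeeze(1)    # the matching indices: tensor([0, 2])

print(mask)
print(idx)
```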

np.invert(locs) transforms the zeros to 255 and the ones to 254. The docs say that np.invert is a bitwise negation. In numpy, locs would be an array of booleans (i.e. one bit of information per element), whereas in PyTorch locs is a tensor of bytes (i.e. 8 bits per element). That is why np.invert on a PyTorch tensor doesn’t work as expected.
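A quick way to see the difference (the values here are illustrative):

```python
import numpy as np

bool_mask = np.array([True, False])            # numpy boolean mask
byte_mask = np.array([1, 0], dtype=np.uint8)   # like an old PyTorch ByteTensor

print(np.invert(bool_mask))   # logical negation: [False  True]
print(np.invert(byte_mask))   # bitwise NOT on bytes: [254 255]
```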

Besides, it is unwise to use numpy operations on Variables. Use PyTorch tensor ops where possible.

Train_negx = Train_x[~locs] should do what you want.

Should I take the indices before or after converting the input into a FloatTensor?

If after, should I just use locs = Train_y == 0 and Train_negx = Train_x[~locs]?
If before, should I convert locs into a tensor, and which type of tensor?

Assuming Train_y is a Variable, locs is already a ByteTensor. The following should just work.

locs = Train_y == 0
Train_negx = Train_x[~locs]

or in one line

Train_negx = Train_x[~(Train_y == 0)]
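For example, with made-up data:

```python
import torch

Train_x = torch.tensor([10., 20., 30., 40.])   # hypothetical inputs
Train_y = torch.tensor([0., 1., 0., 1.])       # hypothetical labels

Train_negx = Train_x[~(Train_y == 0)]          # keeps elements where Train_y != 0
print(Train_negx)                              # tensor([20., 40.])
```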

I started like this. When I used .sum() or torch.sum() it is not giving me exactly the same sum, and that’s why I started changing the code. What could be the reason the sums are not equal?

When you used np.invert it would select all the elements of Train_x; with ~ it should select only the elements that match the inverted condition.

You can check how many elements match the inverted condition using

inverted = ~locs
print("Percent matching elements", 100. * inverted.sum().float() / inverted.numel())

I wrote the same code before and after converting the input into a FloatTensor; here is the output
locs0 = 270
locs1 = 242
after converting to FloatTensor and calculating the sum
locs0 = 14
locs1 = 242

When taking the values Train_x[~locs] or Train_x[locs] it gives the correct answer; it is just the sum that is wrong.

I don’t really understand what is going on here. Can you post a little more code, or sample values of Train_y and Train_x that we can play with?

print(Train_y.shape)
locs0 = (Train_y == 0)
locs1 = (~locs0)
print(locs0.sum())
print(locs1.sum())

(512,)
270
242
Train_x, Train_y = torch.FloatTensor(Train_x), torch.FloatTensor(Train_y)
locs0 = (Train_y == 0)
locs1 = (~locs0)
print(torch.sum(locs0))
print(torch.sum(locs1))
8
242
print(x5.shape)
x5_negx = x5[locs0]
x5_posx = x5[locs1]
print(x5_negx.shape)
print(x5_posx.shape)

torch.Size([512, 1, 100, 100])
torch.Size([264, 1, 100, 100])
torch.Size([248, 1, 100, 100])

So Train_y is originally a numpy array, right?
What dtype is it?

A torch.FloatTensor contains 32-bit floats. I’m guessing that the numpy array isn’t 32-bit float, so maybe the conversion is altering the precision a little.
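Another possible explanation (an assumption, since we can’t see the full code): in older PyTorch versions, summing a ByteTensor accumulated in uint8 and wrapped around at 256. Notably, the 264 matching elements shown above, taken modulo 256, give 8, which matches the printed sum. Casting the mask to a wider type before summing avoids any wraparound:

```python
import torch

# 264 matching elements, stored as bytes like an old ByteTensor mask
locs0 = torch.ones(264, dtype=torch.uint8)

# casting to int64 before summing avoids any uint8 wraparound
print(locs0.long().sum().item())   # 264
```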

Another point concerns the limitations of floating point precision which can be seen with this example

print(1 - .8 - .2)
# outputs -5.551115123125783e-17 on my machine

Train_y == 0 means exactly equal, which is probably too precise when Train_y has some sort of floating point dtype. It might make sense to change Train_y == 0 into abs(Train_y) < epsilon, where epsilon is chosen to be some really small value such as 1e-8.
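A sketch of that comparison, with illustrative values:

```python
import torch

Train_y = torch.tensor([0.0, 1.0, 1e-9, 1.0])   # hypothetical labels with float noise
eps = 1e-8

locs = torch.abs(Train_y) < eps   # treat anything within eps of zero as zero
print(locs)                       # tensor([ True, False,  True, False])
```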

Train_y just contains the classes 0 and 1; it is converted into a FloatTensor only for the cross entropy loss.
Even though the sum gives the wrong output, indexing Train_x with locs still gives the correct output.

Weird.
But if your code works, then that is the important thing.

Yes, it works. I thought maybe there was something wrong that I couldn’t see.

Another quick question:

do I have to call zero_grad after each x.backward(retain_graph=True)?

Just a thought. If Train_y contains only zeros and ones, and you set locs = Train_y == 0, then Train_y should be equal to ~locs, except for differences of type.
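That identity is easy to check on a toy tensor:

```python
import torch

Train_y = torch.tensor([0., 1., 1., 0.])   # labels containing only zeros and ones
locs = Train_y == 0

# ~locs agrees with Train_y once converted back to the same dtype
print(torch.equal((~locs).float(), Train_y))   # True
```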

You should run zero_grad after calling optimizer.step and before the next call to .backward

I’m calling backward twice before optimizer.step. In this case, is zeroing the gradients necessary after each backward or not?

Well, if you want the gradients of the first backward to be used in the update, then you shouldn’t run zero_grad after it.

When you run two backward passes one after the other, the calculated grads get added together.
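A minimal demonstration of the accumulation:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

(x * x).backward()   # d(x^2)/dx at x=2 is 4
(3 * x).backward()   # d(3x)/dx is 3

print(x.grad.item())  # 4 + 3 = 7: the two gradients were added together
x.grad.zero_()        # zero the gradient before the next backward/step cycle
```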
