Indexing a Variable with a mask generated from another Variable

x and y are two matrices. I want to find the indices of nonzero entries in a column of y (let’s say the first column), and take the corresponding rows of x.
In numpy, I can do:

x[y[:,0] > 0]

In theano:
x[(y[:,0]>0).nonzero()]

What is the most appropriate way to do this in PyTorch?
I can do:
m = y[:,0] > 0
x[m.unsqueeze(1).expand_as(x)].resize(m.long().sum().data[0], x.size()[1])

But I assume there should be a much easier way.

Thanks.


Yeah, we added the ability to index with a LongTensor a few days ago. I think it’s in the new binaries, so once you reinstall PyTorch this should work:

x[(y[:, 0] > 0).nonzero().squeeze()]

And in the future, we’re going to support automatic broadcasting, so the NumPy way should work before long too.
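For anyone reading this later, here is a minimal sketch of both approaches side by side. It assumes a recent PyTorch release, where boolean-mask indexing is supported and plain tensors are used instead of Variables; the sample values for x and y are made up for illustration.

```python
import torch

# Made-up sample data: x has 4 rows, and the first column of y decides which rows to keep.
x = torch.arange(12.0).reshape(4, 3)
y = torch.tensor([[ 1.0, 0.0],
                  [-2.0, 5.0],
                  [ 0.0, 1.0],
                  [ 3.0, 0.0]])

mask = y[:, 0] > 0  # boolean mask over the rows of x

# Index-based selection, as in the reply above.
# nonzero() returns shape (k, 1); squeeze(1) drops only that trailing dimension,
# so the result stays 1-D even when exactly one row matches.
idx = mask.nonzero().squeeze(1)
rows_by_index = x[idx]

# NumPy-style selection: index directly with the boolean mask.
rows_by_mask = x[mask]

print(idx.tolist())                              # [0, 3]
print(torch.equal(rows_by_index, rows_by_mask))  # True
```

Note that squeeze(1) is used here rather than a bare squeeze(): with a single matching row, squeeze() would collapse the index to a scalar and the result would lose its row dimension.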

That’s very good news. Thanks.

I just reinstalled from scratch, but it cannot find the nonzero() function. I am getting the following message:

File "/home/yusuf/anaconda2/envs/pytorch-bin-env/lib/python2.7/site-packages/torch/autograd/variable.py", line 86, in __getattr__
raise AttributeError(name)
AttributeError: nonzero

I also have a version that I built from source, and I get the same error from it, too.

How can I get the right version? Thanks.

Right now it only works on tensors, since nonzero() isn’t implemented for Variables yet. I’ll add that soon. For now, I’m afraid you need to do it the way you originally wrote. Sorry for the trouble!
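As a rough illustration of why computing the index on the raw tensor is usually enough: the comparison that builds the mask is not differentiable anyway, so gradients only need to flow through the selected rows of x. The sketch below assumes a recent PyTorch release (where Variable and Tensor are a single type) and that gradients are only wanted with respect to x; the data is made up.

```python
import torch

# Made-up sample data; x is the tensor we want gradients for.
x = torch.arange(12.0).reshape(4, 3).requires_grad_()
y = torch.tensor([[1.0], [-2.0], [0.0], [3.0]])

# Build the row index from detached data: no gradient is needed (or defined) here.
idx = (y.detach()[:, 0] > 0).nonzero().squeeze(1)

# index_select is differentiable with respect to x.
selected = x.index_select(0, idx)
selected.sum().backward()

# Only the selected rows (0 and 3) receive a gradient of 1; the others stay 0.
print(x.grad)
```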


Four months have passed, and we still cannot use nonzero() on Variables…

You can always open an issue in the main repo.
