Indexing a Variable with a mask generated from another Variable

yusuf_isik · February 3, 2017, 8:04pm

x and y are two matrices. I want to find the indices of nonzero entries in a column of y (let’s say the first column), and take the corresponding rows of x.
In numpy, I can do:

x[y[:,0] > 0]

In Theano:
x[(y[:,0]>0).nonzero()]

What is the most appropriate way to do this in PyTorch?
I can do:
m = y[:,0] > 0
x[m.unsqueeze(1).expand_as(x)].resize(m.long().sum().data[0], x.size()[1])

But I assume there should be a much easier way.

Thanks.
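
For reference, a minimal NumPy sketch of the behaviour being asked for. The toy arrays and the names rows_by_mask and rows_by_index are made up for illustration; they are not from the original post. It shows that boolean-mask indexing and nonzero-based integer indexing pick the same rows:

```python
import numpy as np

# Toy data (assumed for illustration): 4 rows, 3 columns.
x = np.arange(12).reshape(4, 3)
y = np.array([[1, 0], [0, 5], [2, 0], [0, 0]])

mask = y[:, 0] > 0                      # True where the first column of y is positive

rows_by_mask = x[mask]                  # boolean-mask indexing (the NumPy idiom above)
rows_by_index = x[np.nonzero(mask)[0]]  # integer indexing, as in the Theano idiom

print(rows_by_mask)
print(np.array_equal(rows_by_mask, rows_by_index))
```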

apaszke · February 3, 2017, 8:19pm

Yeah, we added the ability to index with a LongTensor a few days ago. I think it’s in the new binaries, so once you reinstall PyTorch this should work:

x[(y[:, 0] > 0).nonzero().squeeze()]
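
A small sketch of this nonzero-based indexing on plain tensors. It assumes a recent PyTorch release in which tensors and Variables are a single type; the toy data and variable names are illustrative only. It uses squeeze(1) rather than a bare squeeze() as a defensive choice, so the index stays one-dimensional even when only a single row matches:

```python
import torch

# Toy data (assumed for illustration).
x = torch.arange(12).reshape(4, 3)
y = torch.tensor([[1, 0], [0, 5], [2, 0], [0, 0]])

idx = (y[:, 0] > 0).nonzero()   # shape (k, 1): one row per matching position
idx = idx.squeeze(1)            # shape (k,): drop only the trailing dimension
selected = x[idx]               # rows of x where the first column of y is positive

# With exactly one match the result still has two dimensions.
y_single = torch.tensor([[0, 0], [0, 5], [2, 0], [0, 0]])
single = x[(y_single[:, 0] > 0).nonzero().squeeze(1)]

print(selected)
print(single.shape)
```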

apaszke · February 3, 2017, 8:19pm

And in the future we’re going to support automatic broadcasting, so the NumPy way should eventually work too.
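
As a hedged sketch of the NumPy-style form: recent PyTorch releases do accept a boolean mask directly as an index. The example below assumes such a release, uses made-up toy data, and also checks that gradients flow back only to the selected rows:

```python
import torch

# Toy data (assumed for illustration); x is a float tensor that tracks gradients.
x = torch.arange(12.0).reshape(4, 3).requires_grad_()
y = torch.tensor([[1, 0], [0, 5], [2, 0], [0, 0]])

selected = x[y[:, 0] > 0]   # boolean-mask indexing, same spelling as in NumPy

# Backpropagate through the selection: only the selected rows receive gradient.
selected.sum().backward()

print(selected)
print(x.grad)
```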

yusuf_isik · February 3, 2017, 8:20pm

That’s very good news. Thanks.

yusuf_isik · February 3, 2017, 8:57pm

I just reinstalled from scratch, but it cannot find the nonzero() function. I am getting the following message:

  File "/home/yusuf/anaconda2/envs/pytorch-bin-env/lib/python2.7/site-packages/torch/autograd/variable.py", line 86, in __getattr__
    raise AttributeError(name)
AttributeError: nonzero

I also have a version that I built from source, and I get the same error from it, too.

How can I get the right version? Thanks.

apaszke · February 3, 2017, 9:01pm

Right now it only works on tensors, as we don’t have nonzero() implemented for Variables. I’ll add that soon. I’m afraid that for now you need to do it the way you originally wrote. Sorry for the trouble!
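
A rough sketch of that original workaround, written with masked_select and view instead of resize. This is one possible spelling rather than the exact code from the first post; it assumes a recent PyTorch release, and the toy data and names are illustrative:

```python
import torch

# Toy data (assumed for illustration).
x = torch.arange(12).reshape(4, 3)
y = torch.tensor([[1, 0], [0, 5], [2, 0], [0, 0]])

m = y[:, 0] > 0                          # one boolean per row of x
full_mask = m.unsqueeze(1).expand_as(x)  # repeat the row mask across every column

# masked_select returns a flat 1-D tensor, so restore the column dimension.
selected = x.masked_select(full_mask).view(-1, x.size(1))

print(selected)
```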

acgtyrant · June 12, 2017, 12:30pm

Four months have passed, and we still cannot use nonzero() on Variables…

apaszke · June 12, 2017, 1:18pm

You can always open an issue in the main repo.