Select k random rows without replacement

pbloem · December 29, 2017, 2:52pm

Let's say I have a (variable containing a) matrix of 100 by 1000, and I want to split this matrix into 10 random rows and the remaining 90 rows.

What’s the most natural way to do this in PyTorch? I guess you would make a length-100 ByteTensor with 10 ones and 90 zeros and use that to index. But how do you make such a tensor efficiently? Do you concatenate tensors of ones and zeros and shuffle the result? That seems slow…
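
For the fixed-size split, one possible sketch avoids building a mask at all: draw a random permutation of the row indices with torch.randperm and slice it. The helper name split_random_rows is made up for illustration, and whether this is fast enough depends on your sizes and device.

```python
import torch

def split_random_rows(matrix, k):
    """Hypothetical helper: split `matrix` into k random rows and the rest."""
    n = matrix.size(0)
    # Random permutation of the row indices 0..n-1, on the same device as the data.
    perm = torch.randperm(n, device=matrix.device)
    chosen = matrix[perm[:k]]   # first k indices: a sample without replacement
    rest = matrix[perm[k:]]     # the remaining n - k rows (in shuffled order)
    return chosen, rest

x = torch.randn(100, 1000)
chosen, rest = split_random_rows(x, 10)
print(chosen.shape, rest.shape)
```

Note that the remaining rows come back in shuffled order; if the original order matters, sorting perm[k:] before indexing should restore it.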

A faster approach would be to create a ByteTensor where each element has a probability 0.1 of being 1, and using that to select. This results in selections of different sizes, but works very quickly. Is this always quicker than selecting a fixed number of rows?

smth · December 29, 2017, 2:56pm

Probably something using torch.bernoulli will be efficient.
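
A rough sketch of what the torch.bernoulli route could look like, assuming a recent PyTorch version where boolean masks are used for indexing instead of ByteTensor. The helper name bernoulli_split is invented here, and the number of selected rows varies from call to call (about p * n on average).

```python
import torch

def bernoulli_split(matrix, p):
    """Hypothetical helper: put each row in the first part with probability p."""
    n = matrix.size(0)
    # One selection probability per row.
    probs = torch.full((n,), float(p), dtype=torch.float, device=matrix.device)
    # Draw 0/1 per row, then convert to a boolean mask for indexing.
    mask = torch.bernoulli(probs).bool()
    return matrix[mask], matrix[~mask]

x = torch.randn(100, 1000)
selected, remaining = bernoulli_split(x, 0.1)
print(selected.size(0), remaining.size(0))  # sizes vary, but always sum to 100
```

Because the mask keeps row order, both parts preserve the original ordering of the rows they contain.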