Shuffling a Tensor

Hi Everyone -

Is there a way to shuffle/randomize a tensor? Something equivalent to numpy’s random.shuffle?

Thanks!

Just index with a tensor of random indices.

You could use torch.randperm to create random indices and then index it using @asml’s suggestion.
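
A minimal sketch of the two suggestions combined, assuming a 1-D tensor (the variable names are only illustrative):

```python
import torch

t = torch.arange(10)

# torch.randperm(n) returns a random permutation of the integers 0..n-1.
perm = torch.randperm(t.size(0))

# Indexing with the permutation gives a new tensor that holds
# the same elements in shuffled order; t itself is left unchanged.
shuffled = t[perm]
print(shuffled)
```

For a multi-dimensional tensor, the same indexing shuffles along the first dimension only.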

@asml, @ptrblck - thanks, both. Is something like this

import torch

t = torch.tensor([[1, 2], [3, 4]])
r = torch.randperm(2)  # random row order
c = torch.randperm(2)  # random column order
t = t[r][:, c]

the most elegant solution? It's simple enough, but the indexing in t[r][:,c] feels a bit odd.

You could merge the indexing or use view instead:

t=torch.tensor([[1,2],[3,4]])
r=torch.randperm(2)
c=torch.randperm(2)
t=t[r[:, None], c]

# With view
idx = torch.randperm(t.nelement())
t = t.view(-1)[idx].view(t.size())
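
The view approach can be wrapped in a small helper. This is only a sketch: `shuffle_elements` is a made-up name, not a PyTorch function, and it uses reshape instead of view on the assumption that the input may be non-contiguous (for example, a transposed tensor), where view(-1) would raise an error.

```python
import torch

def shuffle_elements(t: torch.Tensor) -> torch.Tensor:
    """Return a copy of t with all elements shuffled across every dimension."""
    idx = torch.randperm(t.numel(), device=t.device)
    # reshape(-1) also handles non-contiguous inputs by copying when needed.
    return t.reshape(-1)[idx].reshape(t.shape)

t = torch.arange(12).reshape(3, 4).t()  # transposed, hence non-contiguous
out = shuffle_elements(t)
print(out.shape)  # torch.Size([4, 3])
```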

Oh, smart. I like the .view() solution, especially since nelement and size are fixed. The merged indexing is nice, but probably no less awkward than my first go.
Thanks again!

Don’t shuffle with separate row and column permutations; the result is not a truly random shuffle!

Indeed:
An N x N square matrix has (N*N)! possible arrangements of its elements.
But with two permutations, one of the rows and one of the columns, as you do, only (N!)*(N!) arrangements are reachable.
And (N*N)! is far larger than (N!)*(N!) when N is large. Even for N = 2 it is 24 versus 4.
With your code, the matrix

t=torch.tensor([[1,2],[3,4]])

will never be randomized into

t=torch.tensor([[1,4],[2,3]])

Use @ptrblck's code with the view; it is a good one :wink:
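
The counting argument can be checked by brute force for the 2 x 2 example. This is just an illustrative sketch that enumerates every row/column permutation pair and collects the distinct results:

```python
import itertools
import math

import torch

t = torch.tensor([[1, 2], [3, 4]])
n = t.size(0)

# Collect every arrangement reachable by permuting rows, then columns.
reachable = set()
for r in itertools.permutations(range(n)):
    for c in itertools.permutations(range(n)):
        shuffled = t[list(r)][:, list(c)]
        reachable.add(tuple(shuffled.flatten().tolist()))

total = math.factorial(n * n)  # all arrangements of the n*n elements

print(len(reachable), total)      # 4 24
print((1, 4, 2, 3) in reachable)  # False
```

Row and column permutations always keep elements that started in the same row together, which is why [[1,4],[2,3]] can never appear.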

Hello,
If we want to shuffle the order of the images in an image dataset (format: [batch_size, channels, height, width]), I think this is a good method:

t = torch.rand(4, 2, 3, 3)
idx = torch.randperm(t.shape[0])
t = t[idx]  # the shape is unchanged, so no extra .view() is needed

t[idx] retains the structure of channels, height, and width, while shuffling the order of the images.
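
If the images come with labels, the same permutation can be applied to both so the pairs stay aligned. A small sketch, assuming a hypothetical `labels` tensor with one entry per image (not part of the post above):

```python
import torch

images = torch.rand(4, 2, 3, 3)  # [batch_size, channels, height, width]
labels = torch.arange(4)         # hypothetical per-image labels

idx = torch.randperm(images.shape[0])

# Index both tensors with the same permutation so that
# shuffled_images[i] still belongs to shuffled_labels[i].
shuffled_images = images[idx]
shuffled_labels = labels[idx]

print(shuffled_images.shape)  # torch.Size([4, 2, 3, 3])
```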

Exactly what I was looking for. Thanks, mate!

This code works, but the result changes at every run. How can I make it deterministic?

Seed the pseudorandom number generator via torch.manual_seed(SEED) before using the random operation.
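
A short sketch of what that looks like; the value of SEED is arbitrary. The second half uses a separate torch.Generator, which is an alternative if you would rather not reseed the global generator:

```python
import torch

SEED = 0  # any fixed integer

# Reseeding the global generator makes the permutation repeat.
torch.manual_seed(SEED)
first = torch.randperm(5)

torch.manual_seed(SEED)
second = torch.randperm(5)

print(torch.equal(first, second))  # True

# Alternative: a dedicated generator that leaves the global state alone.
gen = torch.Generator().manual_seed(SEED)
third = torch.randperm(5, generator=gen)
```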

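If the tensor is on the CPU, the simplest way seems to be to view it as a NumPy array and shuffle it in place (t.numpy() shares memory with t, so the tensor itself is shuffled):

import numpy as np
import torch

t = torch.arange(5)
np.random.shuffle(t.numpy())  # shuffles the shared memory in place
print(t)
# e.g. tensor([0, 2, 3, 1, 4])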

For numpy parity, it would be handy to have torch.shuffle()
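
Until then, a small helper gets close. This is a hypothetical sketch, not an existing PyTorch API: the name `shuffle` and the `dim` parameter are my own choices, and unlike np.random.shuffle it returns a shuffled copy rather than working in place.

```python
import torch

def shuffle(t: torch.Tensor, dim: int = 0) -> torch.Tensor:
    """Return a copy of t with its slices along `dim` in random order."""
    perm = torch.randperm(t.size(dim), device=t.device)
    return t.index_select(dim, perm)

t = torch.arange(6).reshape(2, 3)
shuffled_cols = shuffle(t, dim=1)  # columns reordered, rows kept intact
print(shuffled_cols.shape)         # torch.Size([2, 3])
```

Because the permutation is created on the tensor's own device, this should also work for tensors that are not on the CPU.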

For batch-first shuffling:

tzr = torch.tensor([
    [[1],[1],[1]],
    [[2],[2],[2]],
    [[3],[3],[3]],
    [[4],[4],[4]],
])

rand_indx = torch.randperm(len(tzr))

tzr[rand_indx]

returns, for example (the order varies from run to run):

tensor([
    [[3], [3], [3]],
    [[4], [4], [4]],
    [[2], [2], [2]],
    [[1], [1], [1]]
])
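
If you keep the permutation around, the shuffle can be undone later. A minimal sketch, relying on the fact that the argsort of a permutation is its inverse (the names `inverse` and `restored` are only illustrative):

```python
import torch

tzr = torch.tensor([
    [[1], [1], [1]],
    [[2], [2], [2]],
    [[3], [3], [3]],
    [[4], [4], [4]],
])

rand_indx = torch.randperm(len(tzr))
shuffled = tzr[rand_indx]

# argsort of a permutation gives the inverse permutation,
# so indexing with it restores the original batch order.
inverse = torch.argsort(rand_indx)
restored = shuffled[inverse]

print(torch.equal(restored, tzr))  # True
```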