Is there a way to reindex a matrix?

wanjunhong0 · April 15, 2021, 10:10am

Let’s say the values in an index matrix (or embedding lookup matrix) do not run from 0 to n.

a = torch.randint(10, 100, (10, 10))
old_index = torch.unique(a)
new_index = torch.arange(len(old_index))

Is there a way to map the new index onto the matrix in PyTorch?

Something like this in Pandas:

a.map(dict(zip(old_index, new_index)))

or in NumPy:

import numpy as np
import torch

a = torch.randint(10, 100, (10, 10))
b = a.numpy()
old_index = np.unique(b)
new_index = np.arange(len(old_index))
index_map = dict(zip(old_index, new_index))

np.vectorize(index_map.get)(b)

KFrank · April 15, 2021, 2:20pm

Hi Junhong!

If I understand your use case, .scatter_() should work:

>>> import torch
>>> torch.__version__
'1.7.1'
>>> _ = torch.manual_seed (2021)
>>> a = torch.randint (10, 100, (10, 10))
>>> old_index = torch.unique (a)
>>> new_index = torch.arange (len (old_index))
>>> imap = -torch.ones (a.numel()).long()
>>> imap.scatter_ (0, old_index, new_index)
tensor([-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,  0,  1, -1, -1,  2, -1, -1,
         3,  4,  5, -1,  6, -1,  7,  8,  9, -1, -1, 10, -1, 11, -1, 12, 13, 14,
        -1, 15, -1, -1, 16, 17, 18, 19, -1, 20, 21, 22, 23, -1, 24, 25, 26, 27,
        -1, 28, 29, 30, 31, 32, 33, 34, -1, 35, 36, 37, 38, 39, 40, 41, 42, 43,
        44, -1, 45, -1, 46, 47, 48, 49, -1, 50, -1, -1, 51, -1, -1, 52, 53, 54,
        -1, 55, 56, -1, -1, 57, -1, 58, -1, 59])
>>> old_index[50]
tensor(81)
>>> imap[old_index[50]]
tensor(50)
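
To get the reindexed matrix itself, the lookup table can then be indexed with a directly. A minimal sketch, assuming a holds non-negative integer (long) values; the name reindexed is only illustrative, and the table here is sized by a.max() + 1 rather than a.numel(), a change from the session above so that the largest value in a is always a valid position:

```python
import torch

torch.manual_seed(2021)
a = torch.randint(10, 100, (10, 10))
old_index = torch.unique(a)                # sorted unique values in a
new_index = torch.arange(len(old_index))   # 0 .. n-1

# Lookup table: imap[old] == new, -1 for values that do not occur in a.
# Sized by a.max() + 1 (an assumption of this sketch, not the original session).
imap = torch.full((int(a.max()) + 1,), -1, dtype=torch.long)
imap.scatter_(0, old_index, new_index)

# Index the table with the whole matrix to remap every entry at once.
reindexed = imap[a]
```

Here reindexed has the same shape as a, and old_index[reindexed] recovers a.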

Best.

K. Frank

wanjunhong0 · April 16, 2021, 6:39am

Hi Frank,

I can’t quite follow. What I want is to remap the entries of the index matrix a itself.
This is how to do it in NumPy, but I wonder whether there is a way to do it in PyTorch?

a = torch.randint (10, 100, (10, 10))
b = a.numpy()
old_index = np.unique(b)
new_index = np.arange(len(old_index))
index_map = dict(zip(old_index, new_index))

np.vectorize(index_map.get)(b)
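
For comparison, a PyTorch-only sketch of the same remapping as the NumPy snippet above. It relies on the return_inverse option of torch.unique, which returns, alongside the sorted unique values, a tensor shaped like the input that holds each entry's position in the unique values. The variable name reindexed is only illustrative:

```python
import torch

a = torch.randint(10, 100, (10, 10))

# old_index: sorted unique values of a.
# reindexed: same shape as a, each entry replaced by the position of its
# value within old_index, i.e. a new index running from 0 to n-1.
old_index, reindexed = torch.unique(a, return_inverse=True)
```

Because old_index is sorted, this should give the same result as mapping through dict(zip(old_index, new_index)) with new_index = 0 .. n-1.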