How to change a [1,32] tensor of 10-bit values into a [1,10] int32 tensor

Here is a question about doing a bit-wise conversion of a PyTorch tensor.
I have a tensor of size [1, 32] and type int16, but only 10 of the 16 bits are useful (the values are small).
I want to convert it bit-wise into a tensor of size [1, 10] and type int32, i.e. the underlying binary data should stay the same.
How can I do this easily?

It seems you would like to “pack” the tensor somehow and store it as the int32 dtype.
In that case, you might want to write a custom C++ extension and perform the bit manipulations there.
Note that the int32 values would of course not represent the real values, since you are not only packing but also interleaving the actual 10-bit data.
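
E.g., a minimal sketch of the general packing idea in plain Python, with made-up values (this simple concatenation differs from the bit-transposed layout you want, but it shows why the packed int32 is no longer meaningful as a number):

a, b, c = 42, 12, 48                # three values that each fit in 10 bits
packed = a | (b << 10) | (c << 20)  # occupy 30 of the 32 bits of one word
print(packed)                       # 50343978 -- unrelated to 42, 12, 48 as a number
print(packed & 0x3FF)               # 42 -- but the bits themselves are preserved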

Thank you!
Is there any tutorial or related notes about writing tensor extensions that I could refer to?

I think this should do the trick.

import numpy as np
import torch

def weird_format(tensor):
    # assert tensor.shape == torch.Size([1, 32])

    temp = tensor.numpy()
    # Reinterpret the int16s as bytes and unpack each byte into 8 bits:
    # shape (8, 64), one column per byte, most significant bit first.
    temp = np.unpackbits(temp.view(np.uint8), axis=0)

    m_s = temp[:, 1::2]  # bits of the most significant bytes (little-endian host)
    l_s = temp[:, ::2]   # bits of the least significant bytes

    # Stack to (16, 32) -- one 16-bit column per input value, most significant
    # bit on top -- and keep only the 10 least significant bits.
    temp = np.vstack((m_s, l_s))[-10:, :]

    print(temp)  # To see the binary representation (y axis)

    # Reverse each row and pack it into bytes, then reinterpret every
    # 4 bytes as one int32: each output number holds one bit from each
    # of the 32 input values.
    temp = np.packbits(temp[:, ::-1]).view(np.int32)
    temp = torch.tensor(temp).unsqueeze(0)

    print(temp.T)  # To see what the 'new' numbers are (taking 32 bits on the x axis)

    return temp
# Test
tensor = torch.arange(32, dtype=torch.int16).unsqueeze(0)

print(weird_format(tensor).shape)
# Output
# Up to down are the 10 least significant bits of the input 0-31 in binary
# Left to right are the 32 bits taken for the new numbers
[[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
 [0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1]
 [0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1]
 [0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1]
 [0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1]]

# If you put these numbers into a binary converter you get the values from above
tensor([[          0],
        [          0],
        [          0],
        [          0],
        [          0],
        [      65535],
        [   16711935],
        [ -252645136],
        [ -858993460],
        [-1431655766]], dtype=torch.int32)

# This is the shape that you wanted
torch.Size([1, 10])
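
By the way, you can do the "binary converter" step in Python itself; e.g. for the last value above (remember the rows were reversed with [:, ::-1] before packing, so read each matrix row right to left):

print(format(-1431655766 & 0xFFFFFFFF, '032b'))
# 10101010101010101010101010101010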

Hope this helps and makes sense :smile:

Hi Maxwell!

For what it’s worth, you can do this with pytorch’s bit-manipulation
routines, without converting to numpy.

Here is a pure-pytorch solution, similar in concept to Matias’s approach:

import torch
print (torch.__version__)

_ = torch.manual_seed (2022)

def weird_format (t10):
    # tbits[n, b, i] is bit b of t10[n, i]: shift right by b, keep the parity
    tbits = (t10.unsqueeze (1).bitwise_right_shift (torch.arange (10, dtype = torch.int16).unsqueeze (1)) % 2 == 1).to (torch.int32)
    # matmul against a vector of powers of two reassembles each row of 32 bits into one int32
    t32 = tbits @ torch.ones (1, dtype = torch.int32).bitwise_left_shift (torch.arange (32, dtype = torch.int32))
    return  t32

t10a = torch.zeros (1, 32, dtype = torch.int16)
t10a[:, 0:3] = torch.tensor ([[42, 12, 48]])
t32a = weird_format (t10a)
print ('t10a:')
print (t10a)
print ('t32a:')
print (t32a)

nBatch = 3
t10b = torch.randint (2**10 - 1, (nBatch, 32), dtype = torch.int16)
t32b = weird_format (t10b)
print ('nBatch:', nBatch)
print ('t10b:')
print (t10b)
print ('t32b:')
print (t32b)

And here is its output:

1.11.0
t10a:
tensor([[42, 12, 48,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
          0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0]],
       dtype=torch.int16)
t32a:
tensor([[0, 1, 2, 3, 4, 5, 0, 0, 0, 0]], dtype=torch.int32)
nBatch: 3
t10b:
tensor([[ 248,  511,  315,  780,  559,  775,  131,  145,   10,  366,  692,  144,
          716,  389,  346,  421,  576,  404,  797,  959,  541,   44,   81,  869,
          861,  660,  792,  368,  255,  690,  306,  799],
        [ 979,  904,  552,  171,  373,  959,   24,  923,  335,  276,  979,  972,
          827,   16,  645,  669,  911,  418,  345,  727,  947,  351,  339,   13,
           83,   81,  581,  927,  972,  668,  357,  126],
        [ 711,   19,  854,   91,   30,  498,  767,  153,  873,  933,  549,  531,
          415,   83,  659,  373,  360,  529,  410,  710,  690,  188,  766,  686,
          298,  538,  786, 1010,  675,  881,  652,  646]], dtype=torch.int16)
t32b:
tensor([[-1847811850,  -267893898, -1816218054, -1791208673,   -10597241,
          2024310295,   432099843,   839564483,  -846274002, -1482877896],
        [ 1342035385, -1988422231,   -55981264, -1197106706, -1417890063,
         -1072558020,  -680784623,   941345963,  1484201907,  1008327847],
        [  805502923, -1612941185, -1058498987,  1139085784,   779548926,
           972130144,   675914093,  -654552351,   755340068,   -19247291]],
       dtype=torch.int32)
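
To unpack the dense one-liner: right-shifting by b and taking the remainder mod 2
isolates bit b. A minimal illustration of just that bit-extraction step, with a
made-up value:

x = torch.tensor ([42], dtype = torch.int16)                      # 42 = 0b0000101010
bits = x.bitwise_right_shift (torch.arange (10, dtype = torch.int16)) % 2
print (bits)   # tensor([0, 1, 0, 1, 0, 1, 0, 0, 0, 0], dtype=torch.int16)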

If all you care about are the bits, you're good. But to echo @ptrblck's
comment, if your original [1, 32] tensor contains meaningful numbers,
your resulting [1, 10] tensor will no longer contain meaningful numbers
(except in garbled form).
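
For completeness, the transformation is invertible, so you can always get the
original values back. A sketch of the inverse, assuming the same bit layout as
weird_format above (weird_unformat is just an illustrative name, not a pytorch
function):

def weird_unformat (t32):
    # bit i of t32[:, b] is bit b of the original t10[:, i]
    bits = (t32.unsqueeze (2).bitwise_right_shift (torch.arange (32, dtype = torch.int32)) % 2 == 1).to (torch.int16)
    # weight bit b by 2**b and sum the ten bits of each column back into one value
    weights = torch.ones (1, dtype = torch.int16).bitwise_left_shift (torch.arange (10, dtype = torch.int16)).unsqueeze (1)
    return  (bits * weights).sum (dim = 1).to (torch.int16)

print (torch.equal (weird_unformat (t32b), t10b))   # should print True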

Best.

K. Frank

Thanks, this gives me inspiration!

Thanks, I'm happy to have two solutions now. :blush: