# How to change a [1,32] int16 tensor into a [1,10] int32 tensor bit-wise

Here is a question about how to do a bit-wise conversion of a PyTorch tensor.
I have a tensor of size [1,32] and type int16, but only 10 of the 16 bits are meaningful (the values are small).
I want to reinterpret it bit-wise as a tensor of size [1,10] and type int32, so that the underlying binary data stays the same.
How could I do this easily?

It seems you would like to "pack" the tensor somehow and store it as the int32 dtype.
In that case, you might want to write a custom C++ extension and perform the bit manipulations there.
Note that the int32 values would of course not represent the real values, since you are not only packing but also interleaving the actual 10-bit data.
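To illustrate that point with a toy example (my own sketch, not from the thread): packing even two small values into one word produces a number that equals neither original, although the bits remain recoverable.

```python
# Toy illustration of "packing": two 4-bit values squeezed into one byte.
# The packed byte preserves the bits but not the original numbers.
a, b = 5, 9                  # each value fits in 4 bits
packed = (a << 4) | b        # 0b0101_1001
print(packed)                # 89, which is neither 5 nor 9

# The original bits are still recoverable by shifting and masking:
assert (packed >> 4) == a
assert (packed & 0xF) == b
```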


Thank you!
Is there any tutorial or related notes about custom C++ extensions that I could refer to?

I think this should do the trick.

```python
import numpy as np
import torch

def weird_format(tensor):
    # assert tensor.shape == torch.Size([1, 32])

    temp = tensor.numpy()
    temp = np.unpackbits(temp.view(np.uint8), axis=0)

    m_s = temp[:, 1::2]  # Most significant bits (high bytes of little-endian int16)
    l_s = temp[:, ::2]   # Least significant bits (low bytes)

    temp = np.vstack((m_s, l_s))[-10:, :]

    print(temp)  # To see the binary representation (y axis)

    temp = np.packbits(temp[:, ::-1]).view(np.int32)
    temp = torch.tensor(temp).unsqueeze(0)

    print(temp.T)  # To see what the 'new' numbers are (taking 32 bits on the x axis)

    return temp

# Test
tensor = torch.arange(32, dtype=torch.int16).unsqueeze(0)
print(weird_format(tensor).shape)
```
```text
# Output
# Up to down are the 10 least significant bits of the inputs 0-31 in binary
# Left to right are the 32 bits taken for the new numbers
[[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
 [0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1]
 [0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1]
 [0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1]
 [0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1]]

# If you put these numbers into a binary converter you get the values from above
tensor([[          0],
        [          0],
        [          0],
        [          0],
        [          0],
        [      65535],
        [   16711935],
        [ -252645136],
        [ -858993460],
        [-1431655766]], dtype=torch.int32)

# This is the shape that you wanted
torch.Size([1, 10])
```

Hope this helps and makes sense
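As a sanity check (my own addition, assuming the bit layout above), the round trip can be verified by unpacking the packed int32 words byte-wise and comparing against the same 10x32 bit matrix that was packed:

```python
import numpy as np
import torch

def check_roundtrip(t16):
    # Forward transform: the same steps as weird_format, minus the prints.
    temp = np.unpackbits(t16.numpy().view(np.uint8), axis=0)
    bits10 = np.vstack((temp[:, 1::2], temp[:, ::2]))[-10:, :]
    packed = np.packbits(bits10[:, ::-1]).view(np.int32)
    # Inverse: unpack the int32 words back into bits and undo the column reversal.
    back = np.unpackbits(packed.view(np.uint8)).reshape(10, 32)[:, ::-1]
    assert np.array_equal(back, bits10)

check_roundtrip(torch.arange(32, dtype=torch.int16).unsqueeze(0))
```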


Hi Maxwell!

For what it's worth, you can do this with pytorch's bit-manipulation
routines, without converting to numpy.

Here is a pure-pytorch solution, similar in concept to Matias's approach:

```python
import torch
print (torch.__version__)

_ = torch.manual_seed (2022)

def weird_format (t10):
    tbits = (t10.unsqueeze (1).bitwise_right_shift (torch.arange (10, dtype = torch.int16).unsqueeze (1)) % 2 == 1).to (torch.int32)
    t32 = tbits @ torch.ones (1, dtype = torch.int32).bitwise_left_shift (torch.arange (32, dtype = torch.int32))
    return t32

t10a = torch.zeros (1, 32, dtype = torch.int16)
t10a[:, 0:3] = torch.tensor ([[42, 12, 48]])
t32a = weird_format (t10a)
print ('t10a:')
print (t10a)
print ('t32a:')
print (t32a)

nBatch = 3
t10b = torch.randint (2**10 - 1, (nBatch, 32), dtype = torch.int16)
t32b = weird_format (t10b)
print ('nBatch:', nBatch)
print ('t10b:')
print (t10b)
print ('t32b:')
print (t32b)
```

And here is its output:

```text
1.11.0
t10a:
tensor([[42, 12, 48,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
          0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0]],
       dtype=torch.int16)
t32a:
tensor([[0, 1, 2, 3, 4, 5, 0, 0, 0, 0]], dtype=torch.int32)
nBatch: 3
t10b:
tensor([[ 248,  511,  315,  780,  559,  775,  131,  145,   10,  366,  692,  144,
          716,  389,  346,  421,  576,  404,  797,  959,  541,   44,   81,  869,
          861,  660,  792,  368,  255,  690,  306,  799],
        [ 979,  904,  552,  171,  373,  959,   24,  923,  335,  276,  979,  972,
          827,   16,  645,  669,  911,  418,  345,  727,  947,  351,  339,   13,
           83,   81,  581,  927,  972,  668,  357,  126],
        [ 711,   19,  854,   91,   30,  498,  767,  153,  873,  933,  549,  531,
          415,   83,  659,  373,  360,  529,  410,  710,  690,  188,  766,  686,
          298,  538,  786, 1010,  675,  881,  652,  646]], dtype=torch.int16)
t32b:
tensor([[-1847811850,  -267893898, -1816218054, -1791208673,   -10597241,
          2024310295,   432099843,   839564483,  -846274002, -1482877896],
        [ 1342035385, -1988422231,   -55981264, -1197106706, -1417890063,
         -1072558020,  -680784623,   941345963,  1484201907,  1008327847],
        [  805502923, -1612941185, -1058498987,  1139085784,   779548926,
           972130144,   675914093,  -654552351,   755340068,   -19247291]],
       dtype=torch.int32)
```

If all you care about are the bits, you're good. But to echo @ptrblck's
comment, if your original [1, 32] tensor contains meaningful numbers,
your resulting [1, 10] tensor will no longer contain meaningful numbers
(except in garbled form).
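For what it's worth, this particular layout is invertible: int32 bit i of output word j is exactly bit j of input element i. A hedged sketch of an inverse (my own `unweird_format`, not from the thread) that recovers the low 10 bits of each original element:

```python
import torch

def weird_format(t10):
    # K. Frank's forward transform from above, unchanged
    tbits = (t10.unsqueeze(1).bitwise_right_shift(
        torch.arange(10, dtype=torch.int16).unsqueeze(1)) % 2 == 1).to(torch.int32)
    return tbits @ torch.ones(1, dtype=torch.int32).bitwise_left_shift(
        torch.arange(32, dtype=torch.int32))

def unweird_format(t32):
    # bits[b, j, i] is int32 bit i of word j, i.e. bit j of original element i
    bits = t32.unsqueeze(2).bitwise_right_shift(torch.arange(32)) % 2
    # fold the 10 bit-planes back into integers (only the low 10 bits survive)
    return (bits << torch.arange(10).unsqueeze(1)).sum(dim=1).to(torch.int16)

t10 = torch.randint(2**10, (1, 32), dtype=torch.int16)
assert torch.equal(unweird_format(weird_format(t10)), t10)
```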

Best.

K. Frank


Thanks, this gives me inspiration.

Thanks, it's great to have two solutions to refer to.