Implementation of numpy function .view(uint8)

I’m looking to get the byte-decomposed values of a Float32 tensor on the GPU. Would anyone know how to do this? In NumPy, this would be:

import numpy as np

float_nums = [0.16474093, -0.06143471, 0.09829687]
float_arr = np.array(float_nums, dtype=np.float32)
uint8_arr = float_arr.view(np.uint8)
# uint8_arr is now 4 times the length of float_arr,
#   and I can perform various bit operations,
#   like bit-masking the float mantissa, etc.

PyTorch’s bit operators (^/&/etc.) require ByteTensors - this is my underlying use-case.

Unfortunately, the .byte() function doesn’t reinterpret the underlying bytes; instead it converts each floating-point value, truncating and wrapping it to the nearest byte value. For example, calling .byte() on a tensor with the values in the snippet above would result in an all-zeros tensor.
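The difference between a value cast and a byte-level reinterpretation is easy to see in NumPy, where astype plays the role of .byte() (value conversion) while view reinterprets the same memory:

```python
import numpy as np

float_arr = np.array([0.16474093, -0.06143471, 0.09829687], dtype=np.float32)

# Value cast: each float is truncated toward zero, so all three become 0
cast_arr = float_arr.astype(np.uint8)

# Byte reinterpretation: the same 12 bytes of memory, viewed as 12 uint8s
view_arr = float_arr.view(np.uint8)

print(cast_arr)        # all zeros
print(view_arr.shape)  # 4 bytes per float32
```

Viewing the uint8 array back as float32 recovers the original values exactly, since no data was copied or converted.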

I could round-trip through NumPy and back, but these tensors live on the GPU, and shuttling them back and forth to main memory is wasteful and begins to dominate my inference and training time.
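For reference, the NumPy round-trip described above might look like the sketch below (shown with a CPU tensor for brevity; on GPU you would add .cpu()/.cuda() hops, which is exactly the traffic being avoided):

```python
import numpy as np
import torch

t = torch.tensor([0.16474093, -0.06143471, 0.09829687], dtype=torch.float32)

# Hop through NumPy to reinterpret the same bytes as uint8
bytes_t = torch.from_numpy(t.numpy().view(np.uint8))  # 4 bytes per float

# Bitwise ops now work on the ByteTensor; for example, clear the lowest
# mantissa byte of each float (byte 0 on a little-endian machine)
masked = bytes_t.clone()
masked[0::4] = 0

# Reinterpret back to float32: values are only slightly truncated
back = torch.from_numpy(masked.numpy().view(np.float32))
```

Note that torch.from_numpy and .numpy() share memory, so the reinterpretation itself is free; the cost in the GPU case comes purely from the device-to-host-to-device copies.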

Any suggestions on how to achieve this same functionality would be appreciated.



I’m afraid there is no way to do this at the moment.
Would a reinterpret function be an interesting feature, @smth?

Thanks for the quick response. If that’s the case, I might look into writing a CUDA extension for this bit manipulation and explore the feasibility of doing the reinterpretation inside a custom kernel.

If so desired, I could look into extending this into a more general reinterpret-bytes function to submit upstream.

@albanD yes a reinterpret function would be nice, though not sure how far the rabbit-hole goes in implementing it :slight_smile:


Yes, a small CUDA kernel built with the new cpp extension API should be very simple to write, and it’s the easiest way to do this.

@smth There is definitely no API that lets you do that easily. I’ll take a look when I have a bit of free time and I’m done with the hook thing.