Hello! Is it possible to load an 8-bit unsigned integer as an 8-bit float between 0~1 (if such a type exists)?

I have data that is inherently an 8-bit unsigned integer (0~255), but I want to normalize it to 0~1 before performing the forward pass. I guess there are two ways to do this:

1. Since torch tensors support 8-bit unsigned integers, load the 8-bit unsigned integer to the GPU, then normalize it on the GPU
2. Normalize the 8-bit integer while it is still in the CPU tensor (converting it to float), then load that to the GPU
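For reference, the first option could be sketched like this (a minimal, hypothetical example; `data_uint8` is a stand-in for the actual data):

``````python
import torch

# Sketch of option 1: keep the data as uint8 until it is on the device,
# then convert to float32 and scale into [0, 1] there.
data_uint8 = torch.randint(0, 256, (3, 4, 4), dtype=torch.uint8)
device = "cuda" if torch.cuda.is_available() else "cpu"

x = data_uint8.to(device).float() / 255.0
print(x.dtype, x.min().item(), x.max().item())
``````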

I have a question about the second option. It seems that there isn't anything like an "8-bit integer" equivalent in float. Is there? Is there a way to convert the 8-bit integer into an 8-bit float tensor (so that precision is preserved)?

I am asking because I would prefer to do 2. instead of 1., since that would make the code cleaner!

(Sorry if my questions are a bit basic!)

The second approach is the common one, and a `uint8` image tensor will be normalized to a `float32` tensor in the range `[0, 1]` by `torchvision.transforms.ToTensor()`, as seen in this example:

``````
import numpy as np
import PIL.Image
import torch
from torchvision import transforms

# load a uint8 image
img = PIL.Image.open(PATH)
# or generate a random image
img = transforms.ToPILImage()(torch.randn(3, 224, 224))
``````
print(np.array(img).min(), np.array(img).max())
# 0 255

out = transforms.ToTensor()(img)
print(out.min(), out.max())
# tensor(0.) tensor(1.)
``````

8-bit integers, i.e. images using the `uint8` data type, can be mapped to `float32` without a loss in precision, since `float32` can represent all integers up to `2**24`, where rounding to multiples of 2 starts:

``````torch.tensor(2.**24)
# tensor(16777216.)
torch.tensor(2.**24+1)
# tensor(16777216.)
torch.tensor(2.**24+2)
# tensor(16777218.)
``````

in case you want to keep the original integer values. If not, you can also verify that a "reverse" mapping is possible:

``````y = out * 255
(y == torch.from_numpy(np.array(img)).permute(2, 0, 1)).all()
# tensor(True)
``````

Yes, but also all model parameters are stored in `float32` by default, and the gradient calculation needs floating point values. You could use mixed-precision training as described here, which would allow you to use the `float16` or `bfloat16` data types where it's considered to be safe.
Lower numerical formats are currently being developed, and e.g. `TransformerEngine` allows you to use some Transformer-specific modules in `FP8` format on the latest NVIDIA Hopper GPU architecture.
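To illustrate the mixed-precision point, here is a minimal sketch using `torch.autocast` (the tiny `Linear` model is hypothetical; on CPU, `bfloat16` autocast is used so the snippet runs without a GPU). Note that the parameters stay `float32` while the autocast region computes selected ops in the lower-precision dtype:

``````python
import torch

# Hypothetical tiny model; its parameters are stored in float32 as usual.
model = torch.nn.Linear(8, 2)

# uint8 input normalized to [0, 1] as discussed above
x = torch.randint(0, 256, (4, 8), dtype=torch.uint8).float() / 255.0

# autocast runs eligible ops (e.g. the matmul inside Linear) in bfloat16,
# while the master weights remain float32
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = model(x)

print(out.dtype)                           # torch.bfloat16
print(next(model.parameters()).dtype)      # torch.float32
``````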