I noticed that torch.float16’ s smallest positive number 2**(-24). However, IEEE 754 set the smallest non-zero positive number as 2**(-14). (when not considering denormal representation)
How does torch.float16 work? Is it different from IEEE754 ? Do they apply different exponent bias (a.k.a. zero offset) according to the tensor’s distribution?
The below code shows what I have tried:
x = torch.tensor((2**(-24)*1.))
x
tensor(5.9605e-08)
x.half()
tensor(5.9605e-08, dtype=torch.float16)