Bit width of image

How can I know my image's pixel bit width?

It is the same as the number of channels in the image * 8.

As @Kushaj said, true color RGB images use a bit depth of 24 (8 bits for each channel).
However, your images can of course come from another domain, which might use a different bit depth.
E.g. depth images often use a single channel with 16 bits, while DICOM images might use 12 bits.

If you can load the image via PIL, you could check the image.depth or image.mode attributes.
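For example, a minimal sketch (the file path is a placeholder and the mode-to-bits mapping only covers common cases) that infers the bit width from image.mode:

```
from PIL import Image

img = Image.open("example.jpg")  # placeholder path, use your own file

# PIL exposes the pixel format via the mode string; map common modes to bits per pixel
bits_per_mode = {
    "1": 1,       # 1-bit black and white
    "L": 8,       # 8-bit grayscale
    "P": 8,       # 8-bit palette
    "RGB": 24,    # 3 x 8-bit channels
    "RGBA": 32,   # 4 x 8-bit channels
    "I;16": 16,   # 16-bit grayscale
    "I": 32,      # 32-bit integer pixels
    "F": 32,      # 32-bit floating point pixels
}
print(img.mode, bits_per_mode.get(img.mode, "unknown"))
```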


But the maximum and minimum values of my image pixels are 151.47 and -119.675. If we keep 1 bit for the sign bit, we then need more than 8 bits for the integer part alone, so there is a bit overflow here.

If your max and min values are 151.47 and -119.675, respectively, your image would be encoded in a floating point format, not as an 8-bit integer.
I don’t fully understand the use case. What did image.depth or image.mode return?

Sorry @ptrblck for the very late reply. I am getting mode as RGB, and for depth I am getting AttributeError: 'JpegImageFile' object has no attribute 'depth'.

It's a bit strange that JPEG images in RGB return floating point values, as I would assume they are encoded using uint8. Could you post the results of:

```
import numpy as np

img = ...  # load your image here (e.g. via PIL)
a = np.array(img)
print(np.max(a), np.min(a), a.dtype)
```

```
img = Image.open("/content/XNOR-Net-PyTorch/ImageNet/networks/data/val/n01440764/ILSVRC2012_val_00002138.JPEG")
a = np.array(img)
print(np.max(a), np.min(a), a.dtype)  # -> (255, 0, dtype('uint8'))
```

Thanks for the update.
How do these min and max values fit with your previously reported values of 151.47 and -119.675?

My image undergoes transforms, so I guess those values come from that:

```
val_loader = torch.utils.data.DataLoader(
    datasets.ImageFolder(valdir, transforms.Compose([
        transforms.Resize((256, 256)),
        transforms.CenterCrop(input_size),
        transforms.ToTensor(),
        normalize,
    ])),
    batch_size=args.batch_size, shuffle=False,
    num_workers=args.workers, pin_memory=True)
```

ToTensor normalizes uint8 images to the range [0, 1], so the normalize transformation would need to use a really small std to blow these values up again.

Anyway, as you can see from your previous post, the bit width is 8 bits for your images: dtype('uint8').

  • As uint8 suggests, all pixel values must be positive, right? But after undergoing the transforms my input has negative values.

transforms.Normalize subtracts the mean and divides by the std, so negative and positive values are expected. However, they are usually smaller in magnitude than your reported values, so feel free to post a full code snippet to reproduce these values.
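As an illustration (the exact mean/std values used in your script are an assumption here), the magnitude of the normalized output depends entirely on the scale the statistics are given in:

```
import torch
from torchvision import transforms

x = torch.rand(3, 224, 224)  # what ToTensor produces for a uint8 image: float32 in [0, 1]

# usual ImageNet statistics on the [0, 1] scale -> output roughly in [-2.2, 2.7]
norm_01 = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                               std=[0.229, 0.224, 0.225])
print(norm_01(x).min(), norm_01(x).max())

# hypothetical: if the mean/std were specified on a 0-255 pixel scale (and the
# input rescaled to 0-255 as well), the output spans roughly [-124, 151],
# i.e. the same order of magnitude as the values reported in this thread
norm_255 = transforms.Normalize(mean=[123.675, 116.28, 103.53], std=[1.0, 1.0, 1.0])
print(norm_255(x * 255).min(), norm_255(x * 255).max())
```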

  • Here is the code where I am getting the input:
```
def validate(val_loader, model, criterion):
    batch_time = AverageMeter()
    losses = AverageMeter()
    top1 = AverageMeter()
    top5 = AverageMeter()

    # switch to evaluate mode
    model.eval()

    end = time.time()
    bin_op.binarization()
    for i, (input, target) in enumerate(val_loader):
        target = target.cuda(async=True)
        with torch.no_grad():
            input_var = torch.autograd.Variable(input)
            target_var = torch.autograd.Variable(target)
            print(input_var)

        # compute output
        output = model(input_var)
        loss = criterion(output, target_var)
```

* This is how my input sample looks; I am getting **floating point** values as input although it is encoded in **uint8**.

* So I round it off to the nearest integer to get an 8-bit value.

print(input_var)

```
27.325001 13.325003 -56.674995 -62.674999 -66.675003 -70.675003
    -64.675003 -57.674995 -18.675001 18.325003 -2.675003 3.324996
    12.325003 16.325003 25.325001 -14.675002 1.324996 -12.675002
    -27.674999 -38.674999 -23.675001 -23.675001 -37.674999 -16.675001
    57.324997 -46.674999 -72.675003 -74.675003 -78.675003 -102.674995
    -98.675003 -81.675003 19.325003 104.324989 101.324989 125.324989
    125.324989 120.324989 125.324989 128.324982 127.324989 128.324982
    125.324989 121.324989 123.324989 126.324989 126.324989 108.324989
    110.324989 128.324982 128.324982 129.324982 129.324982 129.324982
    129.324982 127.324989 129.324982 127.324989 122.324989 125.324989
    129.324982 129.324982 127.324989 127.324989 129.324982 128.324982
    129.324982 129.324982 130.324982 130.324982 128.324982 127.324989
    128.324982 126.324989 128.324982 129.324982 128.324982 128.324982
    127.324989 128.324982 128.324982 130.324982 129.324982 130.324982
    130.324982 131.324982 131.324982 129.324982 129.324982 129.324982
    129.324982 129.324982 129.324982 129.324982 127.324989 128.324982
    128.324982 117.324989 14.325003 15.325003 21.325001 51.324997
    -5.675003 -97.675003 -96.675003 -94.675003 -93.675003 -81.675003
    60.324997 121.324989 119.324989 119.324989 121.324989 125.324989
    125.324989 120.324989 119.324989 127.324989 128.324982 127.324989
    118.324989 2.324996 -84.675003 -90.675003 -95.675003 -96.675003
    -97.675003 -68.675003 33.325001 -0.675003 59.324997 103.324989
```

What if it's NOT uint8, but uint16?
Will ToTensor() also normalize uint16 images to [0, 1]?

No, I don’t think so as seen here:

```
import numpy as np
import PIL.Image
from torchvision import transforms

a = np.random.randint(0, 65535, (224, 224, 3), dtype=np.uint16)
img = PIL.Image.fromarray(a, 'I;16')

transform = transforms.ToTensor()
out = transform(img)
print(out.min(), out.max())
# tensor(-32767, dtype=torch.int16) tensor(32767, dtype=torch.int16)
```

As you can see, it’ll also transform the data to int16 here as uint16 is an unsupported tensor type.

CC @fmassa @vfdev-5 @pmeier
I guess this is intentional due to the lack of uint16 support in torch.from_numpy?
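For reference, a quick check of what torch.from_numpy does with a uint16 array (behavior depends on the PyTorch version; newer releases added limited torch.uint16 support, while older ones reject the dtype):

```
import numpy as np
import torch

a = np.zeros((4, 4), dtype=np.uint16)

try:
    t = torch.from_numpy(a)
    print(t.dtype)  # newer releases may map this to torch.uint16 (limited support)
except TypeError as e:
    print(e)        # older releases raise: can't convert np.ndarray of type numpy.uint16 ...
```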

@ptrblck what is the reason that torch does not support uint16?

I would guess uint16 wasn't added as a dtype because the demand for it might not be that huge, since you could use e.g. int32 with some memory overhead.
Since these integer types are mainly used for inputs (the model won't be able to train with these dtypes directly), the memory overhead might also be negligible.
Adding a new dtype would need to be piped through all methods, increasing the binary size significantly, which is another argument against adding new types unless their use cases justify it.
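As a rough sketch of that widening workaround (the array shape and contents are arbitrary):

```
import numpy as np
import torch

# hypothetical 16-bit input, e.g. a depth map
a = np.random.randint(0, 65535, (512, 512), dtype=np.uint16)

# widen to int32 so the values survive the conversion losslessly,
# at the cost of twice the memory of the original uint16 buffer
t = torch.from_numpy(a.astype(np.int32))
print(t.dtype, t.element_size() * t.nelement())  # torch.int32, 1048576 bytes (vs. 524288 for uint16)
```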


Makes sense.

Just for your information: probably a niche domain, but 16-bit grayscale images are commonly used in microscopy imaging.

Yes, I agree that 16-bit image formats are used, but do you think a native uint16 dtype would still need to be added to PyTorch?
I would assume the common approach would be to load these images (e.g. via PIL, OpenCV, or another image library which supports this format) and convert them to a floating point tensor (float32 by default), which is then used to train the model.
Which operations would you like to apply directly on the uint16 image in PyTorch?
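As an illustration of that loading approach, a minimal sketch (the file name is hypothetical and assumed to open in PIL's I;16 mode):

```
import numpy as np
import torch
from PIL import Image

img = Image.open("cells_16bit.tif")             # hypothetical 16-bit grayscale file, mode "I;16"
a = np.array(img, dtype=np.float32)             # cast to float32 while still in NumPy
t = torch.from_numpy(a).unsqueeze(0) / 65535.0  # 1xHxW tensor scaled to [0, 1]
print(t.dtype, t.min().item(), t.max().item())
```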