When you use torchvision.io.decode_image(..., mode=ImageReadMode.UNCHANGED), if the image has premultiplied alpha (i.e. RGBA rather than NRGBA in Go’s image/color), does it automatically convert to non-premultiplied alpha?
I believe your question is essentially how images with premultiplied alpha are handled during decoding.
When you call torchvision.io.decode_image('/path/to/image', mode='UNCHANGED'), torchvision loads the image as is, i.e. the pixel data and channels are preserved. Hence, if you load an image with premultiplied alpha, it won’t be automatically converted to non-premultiplied alpha.
For example,
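from torchvision.io import decode_image

# Load both images as stored; neither call converts between alpha conventions
img_nrgba = decode_image("./data/example_without_premup_alpha.png", mode='UNCHANGED')
img_rgba = decode_image("./data/example_with_alpha.png", mode='UNCHANGED')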
TL;DR:
Images with premultiplied alpha are not automatically converted to non-premultiplied alpha.
Is there a way to tell with torchvision whether an image uses premultiplied alpha, so that I can normalize my inputs to the same format? Also, will setting mode=RGB force non-premultiplied alpha?
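Short answer: you cannot.
A PNG carries no flag saying whether its color values were premultiplied, so premultiplied alpha cannot be reliably identified from the file (the PNG specification itself defines alpha as non-premultiplied). Most datasets therefore provide images with non-premultiplied alpha, where the alpha channel is simply the fourth channel of the RGBA PNG image.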
Hence, if you load your image with decode_image(..., mode=UNCHANGED) and inspect the loaded tensor’s shape, an original RGB image will have 3 channels and an original RGBA image (non-premultiplied alpha) will have 4 channels. The shape tells you only whether an alpha channel is present, not whether the colors were premultiplied.
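The shape check described above can be sketched as follows, together with a one-way test that sometimes rules premultiplication out. Both helper names are hypothetical and not part of torchvision. The second helper rests on an arithmetic fact rather than on anything the library reports: a premultiplied color value is the color scaled by alpha, so it can never exceed the alpha value of the same pixel. It also assumes that a 2-channel tensor means gray plus alpha.

```python
import torch


def has_alpha_channel(img: torch.Tensor) -> bool:
    """Return True if a (C, H, W) image tensor carries an alpha channel.

    Assumption: with mode=UNCHANGED, 2 channels means gray + alpha and
    4 channels means RGB + alpha. This says nothing about premultiplication.
    """
    if img.ndim != 3:
        raise ValueError("expected a tensor of shape (C, H, W)")
    return img.shape[0] in (2, 4)


def rules_out_premultiplied(img: torch.Tensor) -> bool:
    """Return True if some pixel has a color value above its alpha value.

    True proves the image uses non-premultiplied alpha.
    False proves nothing: the image may use either convention.
    """
    if not has_alpha_channel(img):
        raise ValueError("image has no alpha channel")
    color, alpha = img[:-1], img[-1:]  # alpha is assumed to be the last channel
    return bool((color > alpha).any())
```

In practice you would pass the tensor returned by decode_image(..., mode=UNCHANGED). A False result from rules_out_premultiplied should be read as "unknown", which is why the convention still has to come from whoever produced the images.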
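With that said, setting mode=RGB for a non-premultiplied-alpha image gives you an RGB image with the alpha channel discarded. The alpha channel is dropped rather than used to rescale the colors, so this should not be relied on to undo premultiplication.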
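If you know from some other source (for example, the tool that wrote the files) that certain images are premultiplied, you can normalize them yourself after decoding. The following is a minimal sketch, not a torchvision feature: the function name unpremultiply is hypothetical, and it assumes a uint8 tensor of shape (4, H, W) with alpha as the last channel.

```python
import torch


def unpremultiply(img: torch.Tensor) -> torch.Tensor:
    """Convert a (4, H, W) uint8 RGBA tensor from premultiplied to straight alpha.

    Assumption: the caller already knows the input is premultiplied;
    nothing in the tensor itself can confirm that.
    """
    if img.dtype != torch.uint8 or img.ndim != 3 or img.shape[0] != 4:
        raise ValueError("expected a uint8 tensor of shape (4, H, W)")
    rgb = img[:3].to(torch.float32)
    alpha = img[3:].to(torch.float32)
    # Clamp the divisor so fully transparent pixels do not divide by zero.
    safe_alpha = alpha.clamp(min=1.0)
    straight = (rgb * 255.0 / safe_alpha).round().clamp(0, 255)
    # Fully transparent pixels have no recoverable color; set them to 0.
    straight = torch.where(alpha > 0, straight, torch.zeros_like(straight))
    return torch.cat([straight.to(torch.uint8), img[3:]], dim=0)
```

You would apply this only to the images known to be premultiplied, e.g. unpremultiply(decode_image(path, mode='UNCHANGED')), and leave the others untouched. The conversion is lossy at low alpha values, because premultiplication has already rounded away color precision.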