Reinterpret casting funny business with BDD100k

BDD100k has its panoptic annotations in an RGBA packed format described here. I’m trying to reinterpret each pixel as an int32 so that I can find the unique labels for mask generation. However, when I extract the annotation fields with bitwise operations, I get bad category_ids (over the max category). If I check the raw “R” channel directly, all the class ids are valid, but when I extract them from the upper byte of the int32, I get invalid classes. Ideally I would cast to uint32, but there is no torch.uint32.

self.image_data = (
    torch.from_numpy(
        np.array(Image.open(data_path).convert("RGBA"), dtype=np.uint8)
    )
    .view(dtype=torch.int32)
    .permute(2, 0, 1)
)
...
for seg in self.image_data.unique():
    # Assumes R ends up in the top byte of the int32 and A in the bottom byte
    category_id = seg >> 24
    truncated = seg & 2 ** 19
    occluded = seg & 2 ** 18
    iscrowd = seg & 2 ** 17
    ignore = seg & 2 ** 16
    instance_id = seg & 0xFFFF

Good ol’ little endian: the bitcast puts the first byte in memory (R) into the lowest bits of the int32, so the byte order comes out reversed relative to what I assumed when unpacking. This seems to work now.

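# Little-endian layout of the int32: A << 24 | B << 16 | G << 8 | R
# Parenthesize and mask each byte: + binds tighter than >>, and >> on a
# negative int32 sign-extends, so the top byte needs & 0xFF.
instance_id = ((seg >> 8) & 0xFF00) | ((seg >> 24) & 0xFF)  # (B << 8) + A
truncated = seg & 2 ** 11
occluded = seg & 2 ** 10
iscrowd = seg & 2 ** 9
ignore = seg & 2 ** 8
category_id = seg & 0xFF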
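
For anyone who wants to sanity check the byte order without loading an image, here is a minimal standard-library sketch. It assumes the layout implied by the code above (R = category, G = attribute bits, instance id = (B << 8) + A). The helper names `pack_le_int32` and `decode` and the sample pixel are made up for illustration; they are not part of BDD100k or PyTorch.

```python
import struct


def pack_le_int32(r, g, b, a):
    """Reinterpret four uint8 channel values as one signed little-endian int32."""
    return struct.unpack("<i", bytes([r, g, b, a]))[0]


def decode(seg):
    """Split a packed value into its fields (layout assumed from the post)."""
    return {
        "category_id": seg & 0xFF,           # R is the lowest byte
        "truncated": bool(seg & (1 << 11)),  # G bit 3
        "occluded": bool(seg & (1 << 10)),   # G bit 2
        "iscrowd": bool(seg & (1 << 9)),     # G bit 1
        "ignore": bool(seg & (1 << 8)),      # G bit 0
        # B is bits 16-23 and A is bits 24-31; mask A because >> sign-extends
        "instance_id": ((seg >> 8) & 0xFF00) | ((seg >> 24) & 0xFF),
    }


# Hypothetical pixel: category 7, occluded, B = 0x01, A = 0xC8
seg = pack_le_int32(7, 0b0100, 0x01, 0xC8)
fields = decode(seg)
naive_top_byte = seg >> 24  # negative, because A >= 128 sets the sign bit
```

The same shift and mask expressions should apply elementwise to an int32 tensor. The unmasked `seg >> 24` going negative is also what the missing unsigned type would have hidden, so masking with `& 0xFF` after the shift sidesteps the need for uint32.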