Floating point images in torchvision

fdan · May 28, 2021, 10:49pm

Hi folks,

TorchVision noob question.

I’m doing some research using PyTorch in animation and visual effects, where the standard image format is floating point exr files (lossless, high dynamic range). From some cursory reading, it seems that TorchVision utilises PIL, which I believe is limited to 8-bit int images?

I’m using another library for handling of exr files (OpenImageIO), from which I can load a floating point image, perform some operations such as resize etc, dump into a NumPy array, and from there initialise a torch Tensor.

My question is: is this functionally equivalent to what TorchVision does? That is, in my use case, can I just load image data into tensors without using TorchVision?

cheers

eqy · May 29, 2021, 12:09am

torchvision really only uses 8-bit RGB as an intermediate format, as images are usually converted to floating point before being passed to a model with something like to_tensor().

You can also see a typical example of how transformations are applied before being passed to a model in the ImageNet example:

https://github.com/pytorch/examples/blob/cbb760d5e50a03df667cdc32a61f75ac28e11cbf/imagenet/main.py#L207

        print("=> no checkpoint found at '{}'".format(args.resume))

cudnn.benchmark = True

# Data loading code
traindir = os.path.join(args.data, 'train')
valdir = os.path.join(args.data, 'val')
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

train_dataset = datasets.ImageFolder(
    traindir,
    transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        normalize,
    ]))

if args.distributed:
    train_sampler = torch.utils.data.distributed.DistributedSampler(train_dataset)

Loading a floating point image should not be an issue. If for some reason you need to use a transform that only works on PIL Images, you can use the to_pil_image function.