Hi, I’ve come across an issue where, when I try to apply ToImage to an ndarray of the general form (number of samples, number of channels, height, width), i.e. a batch of greyscale images, I get an error saying that ‘number of dimensions in the tensor input does not match the length of the desired ordering of dimensions i.e. input.dim() = 4 is not equal to len(dims) = 3’.
The following code should reproduce the error:
import numpy as np
import torch
from torchvision.transforms import v2
n_samples = 100
n_channels = 1
height = 20
width = 20
# Create an np array with random entries of size (n_samples, n_channels, height, width)
X = np.random.randn(n_samples, n_channels, height, width)
# Create transform that will be applied to the data
trans = v2.Compose([v2.ToImage(), v2.ToDtype(torch.float32, scale=True)])
# Apply the transform to the data
X_torch = trans(X)
It runs without error if I apply the transform to each image in the array separately, but I was under the impression (from the documentation) that you should be able to apply it to a batch of images.