I retrained SqueezeNet on greyscale 128×128 images, using only the
torchvision.transforms.ToTensor() transform and loading the data with
torchvision.datasets.ImageFolder(). Training ran without errors, even though the original network only accepts 224×224 RGB images.
However, when I try to load a single image through
PIL in order to classify it with the model I trained previously, I get a dimension mismatch error (
Invalid dimensions for image data). What exactly does
torchvision.datasets.ImageFolder() do to fit my images into the preexisting architecture? How can I reproduce the same transformations
torchvision.datasets.ImageFolder() applies to the data, so that I can classify a single image successfully in isolation?
Edit: going through the
torchvision.datasets.ImageFolder() source code, I figured out that by default
PIL is used to load the images, and each one is automatically converted to RGB via
img = Image.open(f) followed by
img.convert('RGB'). However, I still do not know how the difference in spatial resolution is handled.