I retrained SqueezeNet on greyscale 128×128 images, using only the
torchvision.transforms.ToTensor() transform and loading the data with
torchvision.datasets.ImageFolder(). Training ran without errors, even though the original network only accepts 224×224 RGB images.
However, when I try to load a single image through
PIL in order to classify it with the model I trained previously, I get a dimension mismatch error (
Invalid dimensions for image data). What exactly does
torchvision.datasets.ImageFolder() do to fit my images into the preexisting architecture? How can I reproduce the same transformations
torchvision.datasets.ImageFolder() applies to the data, so that I can classify a single image successfully in isolation?
Edit: going through the
torchvision.datasets.ImageFolder() source code, I figured out that by default
PIL is used to load the images, and each one is automatically converted to RGB via
img = Image.open(f) followed by
img.convert('RGB'). However, I still do not know how the difference in spatial resolution is handled.