I encountered some weird behavior in the torchvision library. Whenever I use the CIFAR-10 dataset and a model (e.g., a ResNet) that both come from torchvision, everything works fine.
However, if I swap the CIFAR-10 dataset for a JPEG version loaded with torchvision's ImageFolder, the network does not train properly. Furthermore, the trained weights do not reproduce the accuracy reported during training.
All of the settings are the same, including the transforms passed to the dataloader.
(The last result (8.35…) is actually computed on the training set, and it differs from the loss reported earlier during training.)
I am at a loss. If I use ImageFolder with my own network, the training problem does not appear. Is this a problem with combining ImageFolder and the preset torchvision networks, or am I missing something?
I think the problem here is that you don't apply the same preprocessing (mostly normalization) to your images as the one torchvision uses, which is what the network expects.
Note also that your model is giving almost random guesses if its accuracy is 10%, since CIFAR-10 has 10 equally sized classes.
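To make the 10% figure concrete, here is a small sketch of the baseline you get by always predicting one class. The function name `majority_baseline_accuracy` is made up for illustration; the label list mimics the CIFAR-10 training split (10 classes, 5000 images each).

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / len(labels)


# Labels shaped like the CIFAR-10 training split: 10 classes x 5000 images.
balanced_labels = [cls for cls in range(10) for _ in range(5000)]
print(majority_baseline_accuracy(balanced_labels))  # 0.1
```

A trained model that scores about this value has effectively learned nothing from the inputs, which is consistent with the inputs not being in the form the network expects.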