Hello, everybody!
I have recently downloaded images from ImageNet to try to throw some networks at. But I have run into a problem. The training images come with classifications - a total of 200 in the ‘tiny’ download - that’s all well and good. But the test and validation images both are missing classifications. When I try to run the DataLoader batch-divider on those two data sets the same way I do for the training data, the classifications come up as 0 every time… I have two screenshots attached. One is a screenshot of what the training data download looks like - with each folder being a set of images and each text file corresponding to the classifications. I also attached a screenshot of what the validation and testing data set downloads look like - they are definitely different and I don’t know how to get my hands on their classifications. Could anyone give me some guidance on what I am missing? Many thanks! EDIT - it looks like I am only allowed one media per post as a new user - I opted to include the Validation and Testing Dataset download screenshot. Please let me know if that is not enough information!
What is the directory structure of ‘images’ in each case? It’s common practice to organize classes by directory for classification datasets.
Thanks for the reply! Each of the ‘images’ folders is a collection of JPEG images. In the meantime, I’ve made do with taking the downloaded training data and setting aside a subset of it to serve as the testing data. If it helps, here is the screenshot of what each of those ‘images’ folders looks like on the inside. For the val_data, I realize now that I do have a ‘.txt’ file which contains the classifications. However, for the train_data, I still do not - only a bunch of JPEG images.
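For anyone wanting to reproduce the workaround of carving a test subset out of the training data, here is a minimal sketch. It assumes the training folder has one subfolder per class with JPEG files somewhere beneath it, which the thread does not fully confirm. The function name `split_by_class` and its parameters are hypothetical, chosen only for illustration.

```python
import random
from pathlib import Path

# Assumption: images are JPEG files; the suffix check is case-insensitive.
IMAGE_SUFFIXES = {".jpg", ".jpeg"}

def split_by_class(train_root, holdout_fraction=0.1, seed=0):
    """Split a class-per-folder image tree into train and hold-out lists.

    Returns (train_items, holdout_items, class_to_idx), where each item
    is a (path, class_index) pair.
    """
    root = Path(train_root)
    # Sorted folder names give stable integer labels across runs.
    classes = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_idx = {name: i for i, name in enumerate(classes)}
    rng = random.Random(seed)
    train_items, holdout_items = [], []
    for name in classes:
        # Collect every image below the class folder, however deeply nested.
        files = sorted(p for p in (root / name).rglob("*")
                       if p.suffix.lower() in IMAGE_SUFFIXES)
        rng.shuffle(files)
        # Hold out the same fraction of each class so the split stays balanced.
        n_holdout = round(len(files) * holdout_fraction)
        label = class_to_idx[name]
        holdout_items += [(p, label) for p in files[:n_holdout]]
        train_items += [(p, label) for p in files[n_holdout:]]
    return train_items, holdout_items, class_to_idx
```

Splitting per class rather than over the whole file list keeps every class represented in the held-out set, and the fixed seed makes the split repeatable between runs.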
Generally, I have my train or validation folder with multiple subfolders, with each subfolder’s name specifying the class of the images inside it. However, the ImageNet dataset as downloaded needs some preprocessing to achieve this structure. I have not used the tiny-imagenet dataset, but I followed this blog https://www.adeveloperdiary.com/data-science/computer-vision/how-to-prepare-imagenet-dataset-for-image-classification/ for the ImageNet dataset preprocessing. I hope this helps you to some extent.
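As a rough illustration of that preprocessing step for the validation set, the sketch below copies a flat folder of images into one subfolder per class using the annotation text file. Several things here are assumptions, since the thread only says a ‘.txt’ file exists: that the file is named `val_annotations.txt`, that it is tab-separated with the image filename in the first column and the class ID in the second, and that the images sit in an ‘images’ subfolder. The function name `sort_val_into_class_folders` is hypothetical.

```python
import shutil
from pathlib import Path

def sort_val_into_class_folders(val_dir, out_dir,
                                annotations="val_annotations.txt"):
    """Copy val_dir/images/<file> to out_dir/<class_id>/<file>.

    Assumes each annotation line is tab-separated and starts with
    '<filename>\t<class_id>'; any further columns are ignored.
    Returns the number of images copied.
    """
    val_dir, out_dir = Path(val_dir), Path(out_dir)
    copied = 0
    with open(val_dir / annotations) as f:
        for line in f:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            filename, class_id = fields[0], fields[1]
            dst_dir = out_dir / class_id
            dst_dir.mkdir(parents=True, exist_ok=True)
            # Copy rather than move, so the original download stays intact.
            shutil.copy2(val_dir / "images" / filename, dst_dir / filename)
            copied += 1
    return copied
```

Once the output folder has one subfolder per class, it can be read the same way as the training folder, for example with `torchvision.datasets.ImageFolder`. `ImageFolder` assigns class indices from the sorted folder names, so the labels only line up with the training set if both folders contain the same set of class names.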
Sorry, does ‘train’ have the same flat structure that ‘test’ does, or is it hierarchical, corresponding to the class structure? A quick look at a tiny-imagenet tutorial suggests that it is already organized this way:
https://towardsdatascience.com/pytorch-ignite-classifying-tiny-imagenet-with-efficientnet-e5b1768e5e8f