I downloaded the ImageNet ILSVRC2012 dataset, but I can't figure out how to prepare it for training.
I have:
- a train folder with 1000 folders (one folder per class) with arbitrary names (n01440764, n01443537, …)
- a val folder with 50,000 JPEG images with arbitrary names (ILSVRC2012_val_00000001.JPEG, ILSVRC2012_val_00000002.JPEG, …)
- a val labels file: ILSVRC2012_validation_ground_truth.txt
Is there a common/easy way to do it with PyTorch?
Using ImageNet Class
If you downloaded the ILSVRC2012 tarballs, you can use the torchvision.datasets.ImageNet class.
Directory structure:
ILSVRC2012
├── ILSVRC2012_img_train.tar
├── ILSVRC2012_img_val.tar
└── ILSVRC2012_devkit_t12.tar.gz
Code:
from torchvision.datasets import ImageNet
# split selects the subset to load: 'train' or 'val'
dataset = ImageNet(root='ILSVRC2012', split='train', transform=None)
The ImageNet class will automatically extract the tarballs and load the data. It does not download them, so they must already be in the root directory.
Reference: ImageNet | Torchvision documentation
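Since the class expects the archives to be in place already, a quick check before constructing the dataset can save a confusing error. This is only a minimal sketch: missing_archives is a hypothetical helper (not part of torchvision), and the file names are the ones from the directory listing above.

```python
from pathlib import Path

# Archive names taken from the directory listing in this post.
REQUIRED_ARCHIVES = (
    "ILSVRC2012_img_train.tar",
    "ILSVRC2012_img_val.tar",
    "ILSVRC2012_devkit_t12.tar.gz",
)


def missing_archives(root):
    """Return the names in REQUIRED_ARCHIVES that are not files inside root."""
    root = Path(root)
    return [name for name in REQUIRED_ARCHIVES if not (root / name).is_file()]
```

Calling missing_archives('ILSVRC2012') returns an empty list when all three files are present. If you only ever load one split, the archive for the other split is probably not needed, so treat the result as a hint rather than a hard requirement.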
Using ImageFolder Class
If you have a directory tree with one folder per class (like the extracted ImageNet train folder):
data
├── class_1
│   ├── img_1_1
│   └── img_1_2
├── class_2
│   ├── img_2_1
│   └── img_2_2
├── ...
└── class_n
    ├── img_n_1
    └── img_n_2
Then you can use the ImageFolder class to load this directory tree as a dataset:
from torchvision.datasets import ImageFolder
dataset = ImageFolder(root='data', transform=None)
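It helps to know how the integer labels come about: ImageFolder sorts the class folder names and numbers them from 0, and exposes the result as dataset.class_to_idx. The sketch below mimics that behaviour in plain Python so you can inspect the mapping without loading anything; find_classes here is my own illustrative function, not torchvision's code.

```python
import os


def find_classes(root):
    """Map each sub-directory name of root to an index, in sorted name order."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return {name: idx for idx, name in enumerate(classes)}
```

For the ImageNet train folder this means n01440764 gets index 0, n01443537 gets index 1, and so on. These indices are not the same numbers as the IDs in ILSVRC2012_validation_ground_truth.txt, so the two should not be mixed.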
Reference: ImageFolder | Torchvision documentation
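The val folder from the question is flat, so ImageFolder cannot read it as is; the images first have to be moved into one folder per class. Here is a minimal sketch of that step, under a few assumptions: the labels file has one integer ID per line, in the same order as the sorted image file names, and id_to_wnid is a dict you build yourself that maps those IDs to folder names like n01440764 (as far as I know that mapping lives in the devkit's meta.mat). sort_val_into_class_folders is a hypothetical helper, not a torchvision function.

```python
import shutil
from pathlib import Path


def sort_val_into_class_folders(val_dir, labels_file, id_to_wnid):
    """Move flat validation images into one sub-folder per class.

    id_to_wnid is assumed to map the integer IDs in labels_file to
    class folder names; it has to come from the devkit metadata.
    """
    val_dir = Path(val_dir)
    # Assumption: one integer label per line, ordered like the sorted file names.
    labels = [int(token) for token in Path(labels_file).read_text().split()]
    images = sorted(val_dir.glob("*.JPEG"))
    if len(images) != len(labels):
        raise ValueError(f"{len(images)} images but {len(labels)} labels")
    for image, label in zip(images, labels):
        class_dir = val_dir / id_to_wnid[label]
        class_dir.mkdir(exist_ok=True)
        shutil.move(str(image), str(class_dir / image.name))
```

After this, ImageFolder(root='val') should work the same way as for the train folder. Try it on a copy of a few images first, since the function moves files in place.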