The ImageNet class loads the data from an image directory. As we know, the data has to be downloaded beforehand, since downloading it through the class is no longer supported.
My concern is about parsing the meta.bin file, which is supposed to be in the devkit folder. Let's focus on the classification task. For that, devkit/data/ILSVRC2015_clsloc_validation_ground_truth.txt and devkit/data/meta_clsloc.mat first need to be renamed so that they can be read successfully.
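The renaming step could be scripted roughly as below. This is only a sketch: the source file names are the ones from my devkit, meta.mat is the default filename of parse_meta, and the target name for the ground-truth file is an assumption about what the installed torchvision version expects, so check it against your copy. The helper alias_devkit_files is a made-up name, and it copies rather than renames so the original files stay untouched.

```python
import shutil
from pathlib import Path

# Source names are from the devkit; the target names are ASSUMPTIONS about
# what the parser looks for. Adjust them to your torchvision version.
RENAMES = {
    "meta_clsloc.mat": "meta.mat",
    "ILSVRC2015_clsloc_validation_ground_truth.txt":
        "ILSVRC2012_validation_ground_truth.txt",
}


def alias_devkit_files(devkit_root, renames=RENAMES, subdir="data"):
    """Copy devkit files to the expected names; return the paths created."""
    data_dir = Path(devkit_root) / subdir
    created = []
    for src_name, dst_name in renames.items():
        src, dst = data_dir / src_name, data_dir / dst_name
        # Skip missing sources and never overwrite an existing target.
        if src.exists() and not dst.exists():
            shutil.copy2(src, dst)
            created.append(dst)
    return created
```

Calling alias_devkit_files('/path/to/data/devkit') once before building the dataset should be enough; a second call does nothing because the targets already exist.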
The function parse_meta still fails to parse the data. I found that the modifications below help.
import os

def parse_meta(devkit_root, path='data', filename='meta.mat'):
    import scipy.io as sio

    metafile = os.path.join(devkit_root, path, filename)
    meta = sio.loadmat(metafile, squeeze_me=True)['synsets']
    # Filter on the fifth column, commented out because it leaves meta empty:
    # nums_children = list(zip(*meta))[4]
    # meta = [meta[idx] for idx, num_children in enumerate(nums_children)
    #         if num_children == 0]
    idcs, wnids, classes = list(zip(*meta))[:3]
    classes = [tuple(clss.split(', ')) for clss in classes]
    idx_to_wnid = {idx: wnid for idx, wnid in zip(idcs, wnids)}
    wnid_to_classes = {wnid: clss for wnid, clss in zip(wnids, classes)}
    return idx_to_wnid, wnid_to_classes
For example, meta, just after being read from filename, looks like [(1, n02119789, 'kit fox, vulpis macrotis', 'some description', 1300),(...),...]. I am not sure what the purpose of the lines I commented out is. With them uncommented, the result is meta = [].
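To make the empty result concrete, here is a small sketch that runs the same steps on rows shaped like the one above. The second row is made up for illustration. My guess (unconfirmed) is that the fifth field in this file is not a child count at all, since it is 1300 in the sample, so a filter that keeps only rows where it equals 0 keeps nothing.

```python
# First row is shaped like the one shown in the post; the second row is
# hypothetical and only there so the list has more than one entry.
meta = [
    (1, "n02119789", "kit fox, vulpis macrotis", "some description", 1300),
    (2, "n00000000", "hypothetical class", "some description", 1300),
]

# The commented-out lines treat column 4 as a child count and keep zeros.
fifth_column = list(zip(*meta))[4]
leaves = [meta[idx] for idx, n in enumerate(fifth_column) if n == 0]

# Without the filter, the remaining steps work on the first three columns.
idcs, wnids, classes = list(zip(*meta))[:3]
classes = [tuple(clss.split(", ")) for clss in classes]
idx_to_wnid = dict(zip(idcs, wnids))
wnid_to_classes = dict(zip(wnids, classes))

print(fifth_column, leaves)
print(wnid_to_classes["n02119789"])
```

If that guess is right, the filter was written for a meta file with a different column layout, and skipping it is only safe when every row in the file is already a class you want to keep.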
Can someone confirm whether this is the right approach, or am I looking at the wrong meta files? I downloaded the dataset from Academic Torrents. PyTorch version: 1.3.0
Below is a sample snippet to replicate the problem.
from torchvision.datasets.imagenet import ImageNet, parse_devkit


class CustomImgNet(ImageNet):
    def __init__(self, root, **kwargs):
        self.root = root
        # Parse the devkit and write the meta file before the parent loads it
        meta = parse_devkit(root + '/devkit')
        self._save_meta_file(*meta)
        super().__init__(root + '/data', **kwargs)

    def __getitem__(self, item):
        img, y = super().__getitem__(item)
        return img, y


if __name__ == '__main__':
    from torch.utils.data.dataloader import DataLoader
    from torchvision.transforms import transforms

    val_transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])
    ])
    dataset = CustomImgNet(root='/path/to/data', download=False,
                           split='val', transform=val_transform)