Parsing meta.bin for ImageNet in torchvision

paganpasta · June 7, 2020, 10:23am

The ImageNet class loads the data from the image directory. As we know, the data has to be downloaded beforehand, since automatic download is no longer supported.

My concern is parsing the meta.bin file, which is supposed to be in the devkit folder. Let’s focus on the classification task. For that, devkit/data/ILSVRC2015_clsloc_validation_ground_truth.txt and devkit/data/meta_clsloc.mat first have to be renamed so that they can be read successfully.
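A rough sketch of that renaming step, copying instead of renaming so the originals stay in place. The target name meta.mat comes from the parse_meta default; the target name ILSVRC2012_validation_ground_truth.txt is an assumption about what this torchvision version looks for, and prepare_devkit and RENAMES are made-up names for illustration.

```python
import os
import shutil

# Source name -> name the parser is assumed to look for (adjust as needed).
RENAMES = {
    "meta_clsloc.mat": "meta.mat",
    "ILSVRC2015_clsloc_validation_ground_truth.txt":
        "ILSVRC2012_validation_ground_truth.txt",  # assumed target name
}


def prepare_devkit(devkit_root, renames=RENAMES, subdir="data"):
    """Copy devkit files to their expected names; return the paths created."""
    data_dir = os.path.join(devkit_root, subdir)
    created = []
    for src_name, dst_name in renames.items():
        src = os.path.join(data_dir, src_name)
        dst = os.path.join(data_dir, dst_name)
        # Skip sources that are missing and targets that already exist.
        if os.path.isfile(src) and not os.path.exists(dst):
            shutil.copyfile(src, dst)
            created.append(dst)
    return created
```

Calling prepare_devkit('/path/to/devkit') a second time copies nothing, because existing targets are skipped.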

Even then, the function parse_meta is unable to parse the data. I found that the modifications below help.


def parse_meta(devkit_root, path='data', filename='meta.mat'):
    import os
    import scipy.io as sio

    metafile = os.path.join(devkit_root, path, filename)
    meta = sio.loadmat(metafile, squeeze_me=True)['synsets']
    # Original filter, commented out because it leaves meta empty for me:
    # nums_children = list(zip(*meta))[4]
    # meta = [meta[idx] for idx, num_children in enumerate(nums_children)
    #         if num_children == 0]
    idcs, wnids, classes = list(zip(*meta))[:3]
    # Split 'name a, name b' into a tuple of class names.
    classes = [tuple(clss.split(', ')) for clss in classes]
    idx_to_wnid = {idx: wnid for idx, wnid in zip(idcs, wnids)}
    wnid_to_classes = {wnid: clss for wnid, clss in zip(wnids, classes)}
    return idx_to_wnid, wnid_to_classes

For example, meta just after being read from filename looks like [(1, 'n02119789', 'kit fox, Vulpes macrotis', 'some description', 1300), (...), ...]. I am not sure what the purpose of the lines I have commented out is. With them uncommented, the result is meta = [].
Can someone confirm whether this is the right approach? Or am I looking at the wrong meta files? I downloaded the dataset from Academic Torrents. PyTorch version: 1.3.0
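To make the effect of the commented lines concrete, here is a small pure-Python sketch on made-up records shaped like the example above. The helper names leaf_records and build_maps are hypothetical, the second record is a placeholder, and the reading that the filter expects a child count in the fifth position is inferred only from the variable name nums_children.

```python
# Made-up records: (index, wnid, class names, description, fifth field).
records = [
    (1, "n02119789", "kit fox, Vulpes macrotis", "some description", 1300),
    (2, "n00000000", "placeholder class", "some description", 1300),
]


def leaf_records(meta, children_col=4):
    """Keep records whose children_col entry is 0, as the commented lines do."""
    nums_children = list(zip(*meta))[children_col]
    return [meta[idx] for idx, n in enumerate(nums_children) if n == 0]


def build_maps(meta):
    """Build the two lookup dicts from the first three columns."""
    idcs, wnids, classes = list(zip(*meta))[:3]
    classes = [tuple(clss.split(", ")) for clss in classes]
    idx_to_wnid = dict(zip(idcs, wnids))
    wnid_to_classes = dict(zip(wnids, classes))
    return idx_to_wnid, wnid_to_classes


print(leaf_records(records))  # → []
idx_to_wnid, wnid_to_classes = build_maps(records)
print(wnid_to_classes["n02119789"])  # → ('kit fox', 'Vulpes macrotis')
```

If the fifth field in this file holds something other than a child count (1300 in the example), the filter drops every row, which would be consistent with the empty meta reported above.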
Below is a sample snippet to replicate the problem.

from torchvision.datasets.imagenet import ImageNet, parse_devkit
class CustomImgNet(ImageNet):
    def __init__(self, root, **kwargs):
        self.root = root
        meta = parse_devkit(root+'/devkit')
        self._save_meta_file(*meta)
        super().__init__(root+'/data', **kwargs)

    def __getitem__(self, item):
        img, y = super().__getitem__(item)

        return img, y


if __name__ == '__main__':
    from torch.utils.data.dataloader import DataLoader
    from torchvision.transforms import transforms

    val_transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])
    ])
    dataset = CustomImgNet(root='/path/to/data', download=False, split='val', transform=val_transform)