How do I replicate the exact preprocessing for ImageNet 2012 validation?

I am using the default transformation provided here:

transform = transforms.Compose([
    transforms.RandomSizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean = [ 0.485, 0.456, 0.406 ],
                         std = [ 0.229, 0.224, 0.225 ]),
])

Source: https://github.com/pytorch/examples/blob/27e2a46c1d1505324032b1d94fc6ce24d5b67e97/imagenet/main.py#L48-L62

I am unable to reproduce the validation accuracy for Inception V3; my accuracy is lower by 6%. The problem seems to lie mainly in the cropping: the 78.0% accuracy reported by Google uses a central crop with a fraction of 0.875, then resizes the image with bilinear interpolation. I can reproduce Google’s result with TensorFlow but not with PyTorch. As far as I can tell, the center-crop transform in PyTorch requires a specific size rather than a fraction. Is there a way to work with fractions instead?

Specific transformation from TF: https://github.com/tensorflow/models/blob/master/research/slim/preprocessing/inception_preprocessing.py#L243-L281

Note that I think the transformation is at fault here, not the model itself: after changing to a central crop I get 76% accuracy, which is close but not close enough for reproducibility. I’m very new to PyTorch (first day using it), so it would be great if someone could point me to any existing tools that already do this.

That said, PyTorch really makes life easy for us researchers and it’s a wonderful tool. Thanks a lot! :smiley:

Unfortunately, it seems that only a size is supported at the moment. However, since the size can be computed from the proportion, you can do that yourself :slight_smile:
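As a rough sketch of that arithmetic: if the shorter side is resized to crop_size / fraction and then center-cropped to crop_size, the crop keeps about that fraction of the shorter side. The helper name resize_size_for_fraction is hypothetical, not part of torchvision. This is only an approximation of the TensorFlow preprocessing, because the longer side of a non-square image is cropped more aggressively.

```python
def resize_size_for_fraction(crop_size, fraction):
    """Return the shorter-side size to resize to, so that a center crop of
    `crop_size` keeps roughly `fraction` of that side.

    Hypothetical helper for illustration; not a torchvision function.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return int(round(crop_size / fraction))


# A 224 crop at fraction 0.875 corresponds to resizing the shorter side to 256.
print(resize_size_for_fraction(224, 0.875))  # 256
# For a 299 input (the size Inception V3 usually expects), 299 / 0.875 rounds to 342.
print(resize_size_for_fraction(299, 0.875))  # 342
```

The returned value would go to transforms.Resize and crop_size to transforms.CenterCrop.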

Check the updated ImageNet main.py here:

    val_loader = torch.utils.data.DataLoader(
        datasets.ImageFolder(valdir, transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            normalize,
        ])),
        batch_size=args.batch_size, shuffle=False,
        num_workers=args.workers, pin_memory=True)

This uses a center crop instead of the random crop in the code you posted.

However, if you want finer control over the cropping, you can write your own custom transformation, as described here:

http://pytorch.org/tutorials/beginner/data_loading_tutorial.html#transforms
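A custom transform along those lines might look like the minimal sketch below. The class name CenterCropFraction is hypothetical and not part of torchvision. It assumes the input is a PIL image and relies only on PIL's size attribute and crop method. The rounding is a simple choice and may differ by a pixel from the TensorFlow implementation.

```python
class CenterCropFraction:
    """Crop the central `fraction` of an image along each dimension.

    Hypothetical transform for illustration; expects a PIL image.
    """

    def __init__(self, fraction):
        if not 0.0 < fraction <= 1.0:
            raise ValueError("fraction must be in (0, 1]")
        self.fraction = fraction

    def crop_box(self, width, height):
        # Size of the central region, rounded to whole pixels.
        crop_w = int(round(width * self.fraction))
        crop_h = int(round(height * self.fraction))
        # Offsets that center the region inside the image.
        left = (width - crop_w) // 2
        upper = (height - crop_h) // 2
        return (left, upper, left + crop_w, upper + crop_h)

    def __call__(self, img):
        width, height = img.size  # PIL reports (width, height)
        return img.crop(self.crop_box(width, height))
```

In a transforms.Compose pipeline, CenterCropFraction(0.875) could be followed by transforms.Resize((299, 299)), which uses bilinear interpolation by default, and then transforms.ToTensor() and the normalization step.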

Thanks for the tip. I have made some changes and submitted a pull request here: https://github.com/pytorch/vision/pull/429
