ResNet with CIFAR10 only reaches 86% accuracy (expecting >90%)

etrommer · October 25, 2021, 3:07pm

Hello everyone,

I am trying to reproduce the numbers from the original ResNet publication on CIFAR10. I am using the network implementation from here:

github.com

akamaster/pytorch_resnet_cifar10/blob/master/resnet.py

'''
Properly implemented ResNet-s for CIFAR10 as described in paper [1].

The implementation and structure of this file is hugely influenced by [2]
which is implemented for ImageNet and doesn't have option A for identity.
Moreover, most of the implementations on the web is copy-paste from
torchvision's resnet and has wrong number of params.

Proper ResNet-s for CIFAR10 (for fair comparision and etc.) has following
number of layers and parameters:

name      | layers | params
ResNet20  |    20  | 0.27M
ResNet32  |    32  | 0.46M
ResNet44  |    44  | 0.66M
ResNet56  |    56  | 0.85M
ResNet110 |   110  |  1.7M
ResNet1202|  1202  | 19.4m

which this implementation indeed has.

This file has been truncated. show original

As far as I can tell, I am using the exact training parameters that are given in the paper:

We use a weight decay of 0.0001 and momentum of 0.9,
and adopt the weight initialization in [13] and BN [16] but
with no dropout. These models are trained with a mini-
batch size of 128 on two GPUs. We start with a learning
rate of 0.1, divide it by 10 at 32k and 48k iterations, and
terminate training at 64k iterations, which is determined on
a 45k/5k train/val split. We follow the simple data augmen-
tation in [24] for training: 4 pixels are padded on each side,
and a 32×32 crop is randomly sampled from the padded
image or its horizontal flip.

Here are the relevant parts of my training script:

class ResNet(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.model = resnet.resnet20()

    def forward(self, x): 
        return self.model(x)

    def configure_optimizers(self):
        optimizer = optim.SGD(self.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
        scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[90,135])
        return {'optimizer': optimizer, 'scheduler': scheduler}
...

class CIFAR10DataModule(pl.LightningDataModule):
    def __init__(self, **kwargs):
        ...
        self.normalize = [
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.491, 0.482, 0.447], std=[0.247, 0.243, 0.262]),
        ]
        self.augment = [
            transforms.RandomCrop(32, padding=4),
            transforms.RandomHorizontalFlip(),
        ]   
            
    def prepare_data(self):
        torchvision.datasets.CIFAR10(root=self.data_dir, train=True, download=True)
        torchvision.datasets.CIFAR10(root=self.data_dir, train=False, download=True)
            
    def setup(self, stage=None):
        if stage == 'fit' or stage is None:
            t = transforms.Compose(self.augment + self.normalize)
            cifar_full = torchvision.datasets.CIFAR10(root=self.data_dir, train=True, transform=t)
            self.df_train, self.df_val = torch.utils.data.random_split(cifar_full, [45000, 5000])
        if stage == 'test' or stage is None:
            t = transforms.Compose(self.normalize)
            self.df_test = torchvision.datasets.CIFAR10(root=self.data_dir, train=False, transform=t)

dm = CIFAR10DataModule(batch_size=128, num_workers=4)
dm.setup("fit")
lr_logger = pl.callbacks.LearningRateMonitor()
mlflow.pytorch.autolog()
model = ResNet()
trainer = pl.Trainer(gpus=1, max_epochs=180, callbacks=[lr_logger])
trainer.fit(model, dm)

However the accuracy only reaches around 86%, well below the 91.25% given in the original paper.

There is a comment in the repository that hosts the ResNet/CIFAR10 model which indicates that this issue seemed to occur after an update of PyTorch from version 1.1 to 1.2:

(note that the reported numbers in the issue refer to ResNet56, but the effect is the same, just less pronounced)

I suspect that I am seeing the same issue and would like to understand what is causing it and how I can best fix it.

etrommer · October 25, 2021, 3:22pm

Note that this is validation accuracy, not test accucary. Now that I am thinking about it, I am wondering whether the drop in accuracy I am seeing is a side effect of the augmentation. I will report a value on the test set tomorrow!

etrommer · October 26, 2021, 6:55am

I have evaluated against the test set and the effect stays the same: Training accuracy reaches around 93%, but test accuracy stagnates at around 85%.

etrommer · October 26, 2021, 8:28am

I have found the issue, and it is a very subtle one:

When returning a scheduler to Lightning using the dict format like I do in this line:

        return {'optimizer': optimizer, 'scheduler': scheduler}

the keyword for the scheduler needs to be lr_scheduler, otherwise it will not be picked up and the learning rate will stay high.

s17861418198 · January 16, 2024, 8:00am

i want use imagerfloder get cifar10，but the accuracy reaches 85%，expecting>96%