Missing hyperparameters for pre-trained ImageNet models (torchvision.models)

gallowaa · April 22, 2019, 6:09pm

I’d like to use the pre-trained models from https://pytorch.org/docs/stable/torchvision/models.html in a scientific paper, so I need to know how these models were trained.

Is the augmentation scheme the same as in the example? https://github.com/pytorch/examples/blob/42e5b996718797e45c46a25c55b031e6768f8440/imagenet/main.py

It would be helpful if someone could confirm, for each model: the optimizer (if not SGD), the number of epochs, the initial learning rate and its schedule (if any), the weight decay, the momentum, and the batch size.
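To make the question concrete, here is a minimal sketch of the settings I would report if the released weights simply used the defaults of the linked example script. The values are the script's defaults as I read them. Whether they apply to any given pre-trained model is an assumption, and it is exactly what I am asking about. The names `EXAMPLE_DEFAULTS` and `lr_at_epoch` are my own, not part of torchvision or the example.

```python
# Defaults as I read them in the linked imagenet/main.py example.
# ASSUMPTION: none of these values is confirmed to match the released
# torchvision weights; per-model differences are what this post asks about.
EXAMPLE_DEFAULTS = {
    "optimizer": "SGD",
    "epochs": 90,
    "batch_size": 256,
    "initial_lr": 0.1,
    "momentum": 0.9,
    "weight_decay": 1e-4,
    "lr_decay_factor": 0.1,   # LR is multiplied by this factor ...
    "lr_decay_every": 30,     # ... once every this many epochs
}


def lr_at_epoch(epoch, cfg=EXAMPLE_DEFAULTS):
    """Step schedule: decay the initial LR by a fixed factor every N epochs."""
    steps = epoch // cfg["lr_decay_every"]
    return cfg["initial_lr"] * cfg["lr_decay_factor"] ** steps


if __name__ == "__main__":
    # Print the learning rate at the start, at each decay boundary, and at the end.
    for epoch in (0, 30, 60, 89):
        print(epoch, lr_at_epoch(epoch))
```

A per-model table with these same fields, corrected wherever a model deviates, would fully answer my question.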

Thank you.