I’d like to use the pre-trained models from https://pytorch.org/docs/stable/torchvision/models.html in a scientific paper. It would be helpful to know how these models were trained.
Is the augmentation scheme the same as in the ImageNet example? https://github.com/pytorch/examples/blob/42e5b996718797e45c46a25c55b031e6768f8440/imagenet/main.py
It would be helpful if someone could confirm, for each model: the optimizer (if not SGD), the number of epochs, the initial learning rate and schedule (if any), the weight decay, the momentum, and the batch size.
Thank you.