PyTorch models pretraining source


Regarding PyTorch models such as ResNet and ViT: are they pre-trained only on ImageNet-1K (so the default weights come only from that dataset), or are they pre-trained on ImageNet-21K and then fine-tuned on ImageNet-1K? I’m asking because the timm library has, for example, a ViT-B/16 checkpoint whose weights come from pre-training on the 21K version followed by fine-tuning on the 1K one, so I wanted to know whether that is also the case for the models that come from PyTorch Hub.