Combining a ResNet pretrained on ImageNet with a model pretrained on scenes (e.g. VGG)

I am interested in modifying the transfer learning tutorial so that it can learn from a combination of ImageNet and another dataset that consists mainly of scenes (such as Places2, Places, SUN, etc.). For example, can I combine a pre-trained resnet50 with a pre-trained vgg16 in the transfer learning tutorial?

We have access to vgg16 = models.vgg16(pretrained=True) and resnet50 = models.resnet50(pretrained=True) in PyTorch, and I know how to use each one separately, but I don’t know how to combine the two pre-trained models.

I actually noticed that vgg16 is also pre-trained on ImageNet. Basically I am looking for a model pretrained for object detection and a model pretrained for scene classification.

You could create a two-stage classifier: the features from both pre-trained models are combined (for example, concatenated) and fed into a second model, which predicts the actual classes.
You could train the combined model end-to-end, which would be the simplest way. If you would rather fine-tune both pre-trained models separately and then pass their features to the second-stage model, you would have to carefully create separate validation sets for each stage, since data leakage will be a problem if you just stick to the vanilla train/val/test split.
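One possible way to set up the separate splits for the second approach, as a rough sketch: carve the data into disjoint parts so that the second-stage model is never trained or validated on samples the backbones were fine-tuned on. The dummy dataset, the split names and the sizes below are assumptions for illustration only.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Dummy stand-in for the real image dataset (100 tiny "images", 2 classes).
dataset = TensorDataset(torch.randn(100, 3, 8, 8), torch.randint(0, 2, (100,)))

# Fixed seed so the split is reproducible across runs.
generator = torch.Generator().manual_seed(0)

# Hypothetical split sizes; they must sum to len(dataset).
names = ["backbone_train", "backbone_val", "head_train", "head_val", "test"]
sizes = [40, 10, 30, 10, 10]
splits = dict(zip(names, random_split(dataset, sizes, generator=generator)))

# Stage 1: fine-tune each backbone on backbone_train, validate on backbone_val.
# Stage 2: train the second-stage model on head_train, validate on head_val.
# The test split is only touched once, at the very end.
for name, subset in splits.items():
    print(name, len(subset))
```

Because random_split partitions the indices, no sample ends up in more than one subset.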