A quick question. I have two pre-trained models (let’s say modelA & modelB). I want to create modelC from them, and I also want to use all three models for the final predictions.
Do I make two copies of modelA (let’s say modelA_1, modelA_2) and of modelB (modelB_1, modelB_2)? I would keep the first copies as they are, then take the second copies (modelA_2, modelB_2), remove their classification heads, freeze all other layers, concatenate the features, and add a new classification head.
The short form of my question is, do I make two copies of each pre-trained model?
My goal is to have three predictions in total from modelA_1, modelB_1, & modelC. I hope I’m making sense. Thanks!
You could create copies, but this will of course increase the memory usage and you would need to execute each model separately, so it’s quite wasteful.
The better approach would be to use e.g. a forward hook to grab the desired activations from the original models during a single forward pass, and to pass these activations to the new classifier head.
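Here is a rough sketch of how the hook approach could look. `TinyNet`, its `backbone`/`head` attributes, and the feature sizes are made-up stand-ins for your pre-trained models, so you would need to register the hooks on whichever layer produces the features in your real architectures.

```python
import torch
import torch.nn as nn


# Hypothetical stand-in for a pre-trained model: a feature extractor
# (`backbone`) followed by a classification head (`head`).
class TinyNet(nn.Module):
    def __init__(self, in_dim, feat_dim, num_classes):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        return self.head(self.backbone(x))


class ModelC(nn.Module):
    """Reuses modelA and modelB (no copies) and adds one new head."""

    def __init__(self, model_a, model_b, feat_a, feat_b, num_classes):
        super().__init__()
        self.model_a = model_a
        self.model_b = model_b
        # Freeze the pre-trained models; only the new head is trained.
        for p in list(model_a.parameters()) + list(model_b.parameters()):
            p.requires_grad_(False)
        # The hooks store the penultimate features of each model here.
        self.features = {}
        model_a.backbone.register_forward_hook(self._save("a"))
        model_b.backbone.register_forward_hook(self._save("b"))
        self.head_c = nn.Linear(feat_a + feat_b, num_classes)

    def _save(self, name):
        def hook(module, inputs, output):
            self.features[name] = output.detach()
        return hook

    def forward(self, x):
        # One forward pass per original model yields its prediction
        # and, via the hook, its features.
        out_a = self.model_a(x)
        out_b = self.model_b(x)
        fused = torch.cat([self.features["a"], self.features["b"]], dim=1)
        out_c = self.head_c(fused)
        return out_a, out_b, out_c


model_a = TinyNet(16, 32, 10)
model_b = TinyNet(16, 64, 10)
model_c = ModelC(model_a, model_b, feat_a=32, feat_b=64, num_classes=10)

x = torch.randn(4, 16)
out_a, out_b, out_c = model_c(x)
trainable = [n for n, p in model_c.named_parameters() if p.requires_grad]
```

This way you get all three predictions without duplicating any weights. If your pre-trained models contain dropout or batchnorm layers, you would probably also want to keep them in `eval()` mode while training the new head.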