Dynamic heads in a model: idea, intuition and questions

I am experimenting with EfficientNet for a regression task (3 dependent variables). The training dataset has images plus some metadata that I would like to use; however, the test set will only have images. So if I build a model that takes both image and metadata as inputs, I will also need model(s) that predict the metadata from the image at test time, and then feed those predictions into the final model that predicts the targets.

Zooming into the metadata, things get a bit more interesting. Let's say we have metadata A, B, C. That means 3 sub-models, each with the image as a constant input plus a varying amount of metadata. For example, to predict A the image alone is sufficient; to predict B, the image and A are required; and so on, so the structure is sort of hierarchical. The metadata length therefore differs for each sub-model, and so do the loss functions and final activations, since some metadata is regression, some categorical, some one-hot, etc.

So it's all about the classifier head of the EfficientNet.

Of course, I can create 4 different models (1 for the final regression and 3 sub-models for the metadata).

However, I am experimenting with a single model whose classifier head changes dynamically. I am using LazyLinear so that the input length doesn't have to be specified, which solves the varying input length issue.
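To make the idea concrete, here is a minimal sketch of one model with several heads, assuming PyTorch. The class name `MultiHeadNet`, the head names and the metadata lengths are all made up for illustration, and a tiny conv stack stands in for the EfficientNet feature extractor. Each head starts with `nn.LazyLinear`, so its input size is inferred on the first forward pass.

```python
import torch
import torch.nn as nn


class MultiHeadNet(nn.Module):
    """One shared backbone, one head per task (hypothetical layout)."""

    def __init__(self, head_out_dims):
        super().__init__()
        # Stand-in backbone (assumption): an EfficientNet feature
        # extractor would go here instead.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),  # -> (batch, 8)
        )
        # All heads are registered up front, so their weights live in the
        # same state_dict as the backbone.
        self.heads = nn.ModuleDict({
            name: nn.Sequential(
                nn.LazyLinear(16),  # input size inferred at first call
                nn.ReLU(),
                nn.Linear(16, out_dim),
            )
            for name, out_dim in head_out_dims.items()
        })

    def forward(self, image, meta, head):
        feats = self.backbone(image)
        if meta is not None:
            # Concatenate image features with this head's metadata.
            feats = torch.cat([feats, meta], dim=1)
        return self.heads[head](feats)


# Hypothetical heads: A needs only the image, B also gets A (length 1),
# the final head gets 6 metadata values in total.
model = MultiHeadNet({"A": 1, "B": 4, "final": 3})
images = torch.randn(2, 3, 32, 32)

out_a = model(images, None, "A")
out_b = model(images, torch.randn(2, 1), "B")
out_final = model(images, torch.randn(2, 6), "final")

print(out_a.shape, out_b.shape, out_final.shape)
```

With lazy layers, each head needs one forward pass before its parameters exist, so a dry run per head before building the optimizer is probably the safe order.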

So I have a JSON structure with the required details of each model (loss function, final activation, etc.), and the dataset is prepared accordingly and returns the image, the metadata (whose length depends on the model), and the targets.
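As an illustration only, such a JSON structure could look like the sketch below. The field names (`inputs`, `out_dim`, `loss`, `activation`), the output sizes, and the helpers `meta_length` and `prediction_order` are hypothetical, not my actual config. The helpers derive each head's metadata length and the order in which heads have to be run at test time from the declared dependencies.

```python
import json

# Hypothetical spec: each head lists the metadata it consumes, its output
# size, and the loss / final activation to use.
SPEC_JSON = """
{
  "A":     {"inputs": [],              "out_dim": 1, "loss": "mse",           "activation": "identity"},
  "B":     {"inputs": ["A"],           "out_dim": 4, "loss": "cross_entropy", "activation": "softmax"},
  "C":     {"inputs": ["A", "B"],      "out_dim": 1, "loss": "mse",           "activation": "identity"},
  "final": {"inputs": ["A", "B", "C"], "out_dim": 3, "loss": "mse",           "activation": "identity"}
}
"""


def meta_length(spec, name):
    """Number of metadata values fed to head `name` alongside the image."""
    return sum(spec[dep]["out_dim"] for dep in spec[name]["inputs"])


def prediction_order(spec):
    """Order heads so that every head runs after the heads it depends on."""
    order, done = [], set()
    while len(order) < len(spec):
        ready = [
            name for name in spec
            if name not in done
            and all(dep in done for dep in spec[name]["inputs"])
        ]
        if not ready:
            raise ValueError("cyclic dependency between heads")
        order.extend(ready)
        done.update(ready)
    return order


spec = json.loads(SPEC_JSON)
for name in prediction_order(spec):
    print(name, "metadata length:", meta_length(spec, name))
```

The dataset can then use `meta_length` to decide how many metadata columns to return for the head being trained.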

The whole objective is to have a single model with dynamically changing heads, fed with dynamically changing metadata of different lengths: a sort of declarative, just-in-time model (maybe like a human brain…).

My questions are:

  1. Is this approach practical?
  2. What happens when I change the classifier head (only the metadata part) in eval mode? To make sure everything matches training, I save the 'state' after all batches of an epoch. In eval mode, after changing the classifier head for the 'eval' metadata (my intuition is that a changed head gets initialized with random weights), I reload the previously saved 'state' and then run the eval predictions. I do it this way because I don't know how to access the weights and biases of the classifier head, which is a dynamically created nn.Sequential. Is this OK? Is there a better way to do this?
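The save / swap / reload step from question 2 can be sketched roughly as follows. This is a simplified stand-in, not my real code: `TinyModel` and `make_head` are hypothetical names, a single `nn.Linear` replaces the EfficientNet backbone, and plain `nn.Linear` layers with fixed sizes are used in the head instead of LazyLinear to keep the example small.

```python
import copy

import torch
import torch.nn as nn

torch.manual_seed(0)


def make_head(in_features, out_dim):
    # Hypothetical head factory; every call creates freshly initialised layers.
    return nn.Sequential(
        nn.Linear(in_features, 16),
        nn.ReLU(),
        nn.Linear(16, out_dim),
    )


class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Linear(4, 8)    # stand-in for the backbone
        self.classifier = make_head(8, 3)  # the head that gets swapped

    def forward(self, x):
        return self.classifier(self.features(x))


model = TinyModel()
model.eval()
x = torch.randn(2, 4)

with torch.no_grad():
    before = model(x)

# state_dict() holds references to the live tensors, so take a copy.
saved_state = copy.deepcopy(model.state_dict())

# Replacing the head creates new, randomly initialised parameters.
model.classifier = make_head(8, 3)
with torch.no_grad():
    after_swap = model(x)

# Reloading the saved state restores the trained head weights.
model.load_state_dict(saved_state)
with torch.no_grad():
    after_reload = model(x)

print("same output after swap:  ", torch.allclose(before, after_swap))
print("same output after reload:", torch.allclose(before, after_reload))
```

Note that the reload can only succeed when the new head has the same layer shapes as the saved one; if the eval head takes a different metadata length, `load_state_dict` will report a size mismatch.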

I hope I was clear in describing my task. Expert comments are much appreciated; otherwise I will adopt the conventional multi-modal approach and continue with life. Thank you in advance.