Add multiple FC layers in parallel

As per the paper which I am implementing, I have to freeze layers when I change the dataset, not epoch-wise.

Sure, you have to. This is what I implemented. In my implementation you switch only once (after 50 epochs), but you still have to check each epoch whether you have to switch or not!
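Schematically, that per-epoch check could look roughly like this (only a sketch, assuming the two heads are called model.fc1 and model.fc2, which is not necessarily how your model names them):

```python
# Rough sketch: check every epoch, but the switch itself happens only once, at epoch 50.
for epoch in range(num_epochs):
    train_second = epoch >= 50          # False while training on the first dataset
    for p in model.fc1.parameters():
        p.requires_grad = not train_second
    for p in model.fc2.parameters():
        p.requires_grad = train_second
    # ... run the normal training step for this epoch ...
```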

1.) Targets are the labels you want the network to learn. Your dataset should return an image and the corresponding label in its __getitem__ method (the DataLoader then stacks these into batches; see the first sketch after this list).

2.) Your test module should be similar to your train module, except that you don't do any optimization there (no loss.backward() and no optimizer steps). You should evaluate both FC layers on their corresponding datasets with the corresponding loss functions and calculate the accuracy (see the second sketch below).
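For 1.), a minimal Dataset sketch could look like this (MyImageDataset, images and labels are placeholder names, not anything from your code):

```python
import torch
from torch.utils.data import Dataset

class MyImageDataset(Dataset):
    """Minimal sketch: returns one (image, label) pair per index."""
    def __init__(self, images, labels):
        self.images = images    # e.g. a tensor of shape [N, C, H, W]
        self.labels = labels    # e.g. a tensor of shape [N]

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # one image together with its target; the DataLoader stacks these into batches
        return self.images[idx], self.labels[idx]
```

For 2.), a test loop could look roughly like this (the `head` argument and the loader/criterion names are assumptions about your setup, not a fixed API):

```python
def evaluate(model, loader, criterion, head):
    model.eval()
    total_loss, correct, total = 0.0, 0, 0
    with torch.no_grad():                        # no gradients, no optimizer steps
        for images, targets in loader:
            outputs = model(images, head=head)   # forward through the chosen fc
            total_loss += criterion(outputs, targets).item()
            correct += (outputs.argmax(dim=1) == targets).sum().item()
            total += targets.size(0)
    return total_loss / len(loader), correct / total

# evaluate each fc on its own dataset:
# loss1, acc1 = evaluate(model, loader_ds1, criterion1, head=1)
# loss2, acc2 = evaluate(model, loader_ds2, criterion2, head=2)
```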

After re-writing everything, my model worked as I had hoped. Thanks to you! Just wanted to ask what I should do when I turn on both FC layers. What will be the output_fc in that case?

The output_fc depends on the dataset you're using. If you want to use data which is similar to the dataset the first fc has been trained with, you should use the first fc for prediction, and vice versa.
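For prediction that simply means something like this tiny sketch (data_is_like_dataset1 and the `head` argument are hypothetical, standing in for however you distinguish the inputs in your code):

```python
# pick the head that matches the kind of data you are predicting on
logits = model(images, head=1) if data_is_like_dataset1 else model(images, head=2)
prediction = logits.argmax(dim=1)
```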

I get your point, but as per the paper joint training has to be performed, so as per your suggestion first_fc would be used for calculating the loss, but what about predictions from the second fc?

I cannot answer this question. This strongly depends on your implementation and the theories in the paper (which I currently have no time to look up). You might do some prediction with both layers (but each on its own dataset) and add the losses for backpropagation, but this is only a wild guess.
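In code, that wild guess would look roughly like this (the loaders, criteria and the `head` argument are only assumptions about your implementation):

```python
for (images1, targets1), (images2, targets2) in zip(loader1, loader2):
    optimizer.zero_grad()
    loss1 = criterion1(model(images1, head=1), targets1)   # first fc on dataset 1
    loss2 = criterion2(model(images2, head=2), targets2)   # second fc on dataset 2
    (loss1 + loss2).backward()   # gradients flow through both heads and the shared base
    optimizer.step()
```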

You can’t. This error occurs if you try to load the weights into a layer whose number of outputs is different from the number of outputs the net has been trained with. Therefore, the dimensions of weights and bias do not match the required shapes.

You need to create a new instance of your model class and afterwards load it with

model.load_state_dict(torch.load(SAVED_MODEL_PATH)["state_dict"])
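Put together, a sketch of the whole loading step, assuming MyModel is your model class and the checkpoint stores the weights under the key "state_dict":

```python
import torch

model = MyModel()   # new instance, built with the same output sizes as when saving
checkpoint = torch.load(SAVED_MODEL_PATH)
model.load_state_dict(checkpoint["state_dict"])   # shapes now match the layers
```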

Thanks for the code! However, I have a few questions.

  1. Don’t we have to go through the epochs in a cyclic manner? Once you train fc1, there will be changes required in fc2 as well, since all the common weights will also change.

  2. At what point in time do you train the base model?

In my case, I have a common LSTM model and 3 parallel fc layers, for which I enable training every 20 epochs; each layer goes through its turn 3-4 times, like 1-2-3-1-2-3… and so on.
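Roughly, what I am doing looks like the following sketch (module names such as model.fc1 are placeholders for my actual layers):

```python
fcs = [model.fc1, model.fc2, model.fc3]

for epoch in range(num_epochs):
    active = (epoch // 20) % 3            # switch the trainable head every 20 epochs: 1-2-3-1-2-3...
    for i, fc in enumerate(fcs):
        for p in fc.parameters():
            p.requires_grad = (i == active)
    # ... usual training of the LSTM + active fc for this epoch ...
```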

So I am assuming here that the LSTM layer is enabled for training by default. Am I correct?