How to merge the parameters of two networks with different input channels?

I have two trained models with different numbers of input channels, and I would like to merge their parameters, e.g.:

Model1 =>Conv layer parameters [N1,C1,H,W]
Model2 =>Conv layer parameters [N2,C2,H,W]

Final Model => [N1+N2,C1+C2,H,W]

Concatenation only works along a single axis. Any suggestions on how to merge the conv layer parameters and copy them into a separate network?


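This won’t work as stated… The two models together hold N1xC1xHxW + N2xC2xHxW parameters, which is fewer than the (N1+N2)x(C1+C2)xHxW the merged layer needs. The difference, (N1xC2 + N2xC1)xHxW, is a set of kernel slots that neither model has weights for.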
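To make the counting concrete, here is a minimal sketch of one way to fill the missing slots: put the two weight tensors on the block diagonal of the merged tensor and leave the rest at zero. The helper name `merge_conv_weights` and the sizes are made up for illustration, and zero-filling is an assumption on my part, not something the question or the paper prescribes. With zeros in the cross terms, the merged convolution gives the same result as running the two convolutions on their own channel slices and concatenating the outputs.

```python
import torch
import torch.nn.functional as F


def merge_conv_weights(w1, w2):
    """Place w1 [N1,C1,H,W] and w2 [N2,C2,H,W] on the block diagonal
    of a zero tensor of shape [N1+N2, C1+C2, H, W] (hypothetical helper)."""
    n1, c1, kh, kw = w1.shape
    n2, c2, kh2, kw2 = w2.shape
    assert (kh, kw) == (kh2, kw2), "kernel sizes must match"
    merged = torch.zeros(n1 + n2, c1 + c2, kh, kw, dtype=w1.dtype)
    merged[:n1, :c1] = w1   # model 1 filters see only the first C1 channels
    merged[n1:, c1:] = w2   # model 2 filters see only the last C2 channels
    return merged


torch.manual_seed(0)
w1 = torch.randn(4, 1, 3, 3)  # e.g. a branch fed with 1 input channel
w2 = torch.randn(6, 2, 3, 3)  # e.g. a branch fed with 2 input channels
merged = merge_conv_weights(w1, w2)

# The slots left at zero are exactly the (N1*C2 + N2*C1)*H*W missing parameters.
missing = merged.numel() - (w1.numel() + w2.numel())

x = torch.randn(2, 3, 8, 8)
out_merged = F.conv2d(x, merged, padding=1)
out_separate = torch.cat(
    [F.conv2d(x[:, :1], w1, padding=1), F.conv2d(x[:, 1:], w2, padding=1)],
    dim=1,
)
```

Here `out_merged` and `out_separate` agree up to floating-point error. If the merged layer is then fine-tuned, the zero blocks are free to become non-zero, so the two halves can start to mix.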

Thanks Simon for clarifying.

I am trying to implement the Split-Brain Autoencoder paper in order to understand it, and it requires this kind of merging. I may be misinterpreting the paper, but as far as I understand, the authors do the following:

L_channel (Input) => Network 1 => ab_channel (output)
ab_channel (Input) => Network 2 => L_channel (output)

Finally, both networks are merged along the channel axis at each layer, and the merged network is used to compare different classification/detection tasks. E.g. for a classification network:
RGB => (Network 1 + Network 2) => fc5 features => classifier

Now I am not sure how to merge them. If I use the ‘groups’ parameter of the convolution to split the network, how do we split the 3-channel input so that one part of the network looks at the “L” channel and the other at the “ab” channels?

It might take some time to read the paper to answer my query. This is explained in sections 3.1/3.2 of the paper.

Thanks in advance.

From your description, it seems that you need to transform the RGB images into Lab space and apply the two networks separately, each to its own subset of channels.
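A rough sketch of what that could look like, assuming the input is already a Lab tensor with channel 0 = L and channels 1-2 = ab (the RGB to Lab conversion is left out). Because `groups=2` in `nn.Conv2d` splits the input channels into equal halves, it cannot express a 1 + 2 split, so the first layer here uses two separate convolutions. From the second layer on, where both branches have the same number of channels, two convolutions can be fused into one grouped convolution by concatenating their weights along the output axis. `SplitFirstLayer`, `merge_into_grouped` and all layer sizes are made-up names and values for illustration, not taken from the paper.

```python
import torch
import torch.nn as nn


class SplitFirstLayer(nn.Module):
    """First layer: L and ab channels go through separate convolutions."""

    def __init__(self, out_per_branch=8):
        super().__init__()
        self.conv_l = nn.Conv2d(1, out_per_branch, kernel_size=3, padding=1)
        self.conv_ab = nn.Conv2d(2, out_per_branch, kernel_size=3, padding=1)

    def forward(self, lab):
        # lab: [B, 3, H, W], channel 0 = L, channels 1-2 = ab (assumed layout)
        return torch.cat([self.conv_l(lab[:, :1]), self.conv_ab(lab[:, 1:])], dim=1)


def merge_into_grouped(conv_a, conv_b):
    """Fuse two same-shaped convs into one conv with groups=2 (hypothetical helper)."""
    assert conv_a.weight.shape == conv_b.weight.shape
    merged = nn.Conv2d(
        2 * conv_a.in_channels,
        2 * conv_a.out_channels,
        kernel_size=conv_a.kernel_size,
        stride=conv_a.stride,
        padding=conv_a.padding,
        groups=2,
    )
    with torch.no_grad():
        # Grouped weight shape is [out_channels, in_channels / groups, kH, kW],
        # so stacking along dim 0 puts conv_a in group 0 and conv_b in group 1.
        merged.weight.copy_(torch.cat([conv_a.weight, conv_b.weight], dim=0))
        merged.bias.copy_(torch.cat([conv_a.bias, conv_b.bias], dim=0))
    return merged


torch.manual_seed(0)
first = SplitFirstLayer(out_per_branch=8)
conv_a = nn.Conv2d(8, 16, kernel_size=3, padding=1)  # stands in for Network 1, layer 2
conv_b = nn.Conv2d(8, 16, kernel_size=3, padding=1)  # stands in for Network 2, layer 2
grouped = merge_into_grouped(conv_a, conv_b)

lab = torch.randn(2, 3, 16, 16)
hidden = first(lab)
out_grouped = grouped(hidden)
out_separate = torch.cat([conv_a(hidden[:, :8]), conv_b(hidden[:, 8:])], dim=1)
```

`out_grouped` matches `out_separate` up to floating-point error, so the grouped layer is just the two branches running side by side in one module.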

Yes, we can do that too. But the main intention is to use the weights learned (feature learning) by splitting the network along channels at each layer, then merge both halves and use the result as a single network for other tasks:

RGB => Lab => (Network 1 + Network 2) => fc5 features => classifier

Why can’t you just run each net up to the fc layer and concatenate the results? What’s the importance of having each layer’s outputs concatenated?
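For reference, a minimal sketch of that simpler option: run each network up to its fc features on its own channels, concatenate the two feature vectors, and put a classifier on top. The tiny branches built by `make_branch`, the feature size and the class count are placeholders, not the architecture from the paper, and the input is again assumed to be a Lab tensor with L in channel 0.

```python
import torch
import torch.nn as nn


def make_branch(in_channels, feat_dim=32):
    """Tiny stand-in for a pretrained branch ending in an fc layer."""
    return nn.Sequential(
        nn.Conv2d(in_channels, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(8, feat_dim),
    )


class LateFusionClassifier(nn.Module):
    """Concatenate the fc features of both branches, then classify."""

    def __init__(self, net_l, net_ab, feat_dim=32, num_classes=10):
        super().__init__()
        self.net_l = net_l      # fed with the L channel
        self.net_ab = net_ab    # fed with the ab channels
        self.classifier = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, lab):
        feats = torch.cat([self.net_l(lab[:, :1]), self.net_ab(lab[:, 1:])], dim=1)
        return self.classifier(feats)


torch.manual_seed(0)
model = LateFusionClassifier(make_branch(1), make_branch(2))
logits = model(torch.randn(4, 3, 16, 16))  # one row of class scores per image
```

In this setup the two branches never see each other's activations before the classifier, which is the main difference from merging layer by layer.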

I tried it on my dataset and network, but transfer learning performs worse than random initialization of the weights. So I thought that combining the nets along the channel axis and performing convolution on the combined network would give more valuable features. I am not sure about that either… So I wanted to try combining the networks and training the result for classification.