Combining 2 Neural Networks

I have two images as input, x1 and x2, and I am trying to use convolution as a similarity measure. The idea is that the learned weights substitute for more traditional measures of similarity (cross-correlation, NN, …). I define my forward function as follows:

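def forward(self, x1, x2):
    # Branch a: features of the first image
    out_conv1a = self.conv1(x1)
    out_conv2a = self.conv2(out_conv1a)
    out_conv3a = self.conv3(out_conv2a)

    # Branch b: the same layers applied to the second image (shared weights)
    out_conv1b = self.conv1(x2)
    out_conv2b = self.conv2(out_conv1b)
    out_conv3b = self.conv3(out_conv2b)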

Now for the similarity measure:

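out_cat = torch.cat([out_conv3a, out_conv3b], dim=1)
# nn.Conv2d(...) constructs a layer and cannot be called on a tensor directly.
# self.further_conv is a hypothetical name for an nn.Conv2d created in __init__.
out_further = self.further_conv(out_cat)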
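
For concreteness, here is a minimal self-contained sketch of the setup described above. The class name SimilarityNet, the layer sizes, and the attribute name further_conv are assumptions made for illustration only; activations are left out, as in the snippet above. The point it shows is that the fusion convolution has to be created once in __init__ with 2 * features input channels, so that its weights are registered and trained.

```python
import torch
import torch.nn as nn


class SimilarityNet(nn.Module):
    """Two shared-weight branches followed by a convolution over their concatenation."""

    def __init__(self, in_channels=1, features=8):
        super().__init__()
        # Hypothetical layer sizes; the post does not specify them.
        self.conv1 = nn.Conv2d(in_channels, features, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(features, features, kernel_size=3, padding=1)
        self.conv3 = nn.Conv2d(features, features, kernel_size=3, padding=1)
        # Fusion layer: sees 2 * features channels because both branches
        # are concatenated along dim=1 before it is applied.
        self.further_conv = nn.Conv2d(2 * features, 1, kernel_size=3, padding=1)

    def branch(self, x):
        # The same three layers are used for both images (shared weights).
        return self.conv3(self.conv2(self.conv1(x)))

    def forward(self, x1, x2):
        out_cat = torch.cat([self.branch(x1), self.branch(x2)], dim=1)
        return self.further_conv(out_cat)


net = SimilarityNet()
x1 = torch.randn(2, 1, 16, 16)
x2 = torch.randn(2, 1, 16, 16)
similarity_map = net(x1, x2)
print(similarity_map.shape)  # torch.Size([2, 1, 16, 16])
```

With padding=1 and 3x3 kernels the spatial size is preserved, so the output is one similarity value per pixel; a different head (pooling, a linear layer) would be needed for a single score per image pair.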

My questions are as follows:

  1. Would depthwise/separable convolutions, as in the Google paper, yield any advantage over a 2D convolution of the concatenated input?

  2. It is my understanding that the groups=2 option in Conv2d would provide 2 separate inputs to train weights with, in this case the outputs of the two previous networks. How are these combined afterwards?
    For a basic concept see here.
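
To make both questions concrete, the following sketch compares the three options on a concatenated tensor. The channel counts and variable names are hypothetical. It relies only on the documented behaviour of nn.Conv2d: with groups=2 the input channels are split into two halves, each half is convolved with its own filters, and the two results are concatenated along the channel dimension, so the halves are not mixed inside that layer. A depthwise separable convolution is sketched here as a depthwise convolution (groups equal to the number of input channels) followed by a 1x1 pointwise convolution, which is the step that mixes channels.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
features = 4  # hypothetical number of channels per branch
a = torch.randn(1, features, 8, 8)  # stands in for out_conv3a
b = torch.randn(1, features, 8, 8)  # stands in for out_conv3b
out_cat = torch.cat([a, b], dim=1)

# Grouped convolution: each filter only sees the channels of one branch.
grouped = nn.Conv2d(2 * features, 6, kernel_size=3, padding=1, groups=2)
print(grouped.weight.shape)  # torch.Size([6, 4, 3, 3])

out = grouped(out_cat)
# Replace branch b with new random data and compare the outputs.
out_b_changed = grouped(torch.cat([a, torch.randn_like(b)], dim=1))
# Output channels 0..2 depend only on branch a, channels 3..5 only on branch b.
first_half_same = torch.allclose(out[:, :3], out_b_changed[:, :3])
second_half_same = torch.allclose(out[:, 3:], out_b_changed[:, 3:])

# Depthwise separable convolution: per-channel 3x3, then 1x1 channel mixing.
depthwise = nn.Conv2d(2 * features, 2 * features, kernel_size=3, padding=1,
                      groups=2 * features)
pointwise = nn.Conv2d(2 * features, 6, kernel_size=1)
separable_out = pointwise(depthwise(out_cat))

# Ordinary convolution over the concatenated input, for comparison.
dense = nn.Conv2d(2 * features, 6, kernel_size=3, padding=1)


def count_params(*modules):
    """Total number of trainable parameters in the given modules."""
    return sum(p.numel() for m in modules for p in m.parameters())


n_dense = count_params(dense)
n_grouped = count_params(grouped)
n_separable = count_params(depthwise, pointwise)
print(n_dense, n_grouped, n_separable)  # 438 222 134
```

Under these assumptions, groups=2 on its own never compares the two images within that layer; a later ungrouped layer (for example a 1x1 convolution) is needed to combine them. The separable variant does mix both branches in its pointwise step and uses fewer parameters than the dense layer, but whether that helps accuracy for a similarity task is something to test empirically.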