Only copy architecture

I have three neural networks: A, B, and C.
A and B have different architectures, but I want C to have the same architecture as B while using a different weight and bias initialization, and I want its parameters to be updated independently.
If I do
C = B
then both names refer to the same network object, so the parameters are shared and get updated in the same way.
How do I ensure that both have different parameters but the same architecture?

You could initialize C using the same model class:

B = MyModel()
...
C = MyModel()

Each instantiation creates a fresh set of randomly initialized parameters, so B and C will have the same architecture but independent weights.
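
A minimal sketch of this idea, assuming a hypothetical MyModel definition (not from the original post): both instances have the same layers but separately stored parameters, and each can be given its own optimizer so the updates differ.

import torch
import torch.nn as nn

class MyModel(nn.Module):  # hypothetical stand-in architecture
  def __init__(self):
    super().__init__()
    self.fc = nn.Linear(10, 2)
  def forward(self, x):
    return self.fc(x)

B = MyModel()
C = MyModel()  # same architecture, freshly initialized weights

# the parameters live in different tensors
print(B.fc.weight.data_ptr() == C.fc.weight.data_ptr())
> False

# separate optimizers update the two networks independently
optimizer_b = torch.optim.SGD(B.parameters(), lr=1e-2)
optimizer_c = torch.optim.Adam(C.parameters(), lr=1e-3)
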
import torch
import torch.nn as nn
import torchvision
import torchsummary

class CombineBase(nn.Module):
  def __init__(self, ModelOne, ModelTwo):
    super().__init__()
    self.modelone = ModelOne.to('cuda')
    self.modeltwo = ModelTwo.to('cuda')
    self.lin_one = nn.Linear(2000, 1000)  # 1000 + 1000 concatenated AlexNet outputs
    self.lin_two = nn.Linear(1000, 10)
    self.softmax = nn.Softmax(dim=-1)
  def forward(self, x):
    # feed the same input through both submodels and concatenate their outputs
    out = torch.cat((self.modelone(x), self.modeltwo(x)), dim=-1)
    out = self.softmax(self.lin_two(self.lin_one(out)))
    return out

class CombineMiddle(nn.Module):
  def __init__(self, ModelOne, ModelTwo):
    super().__init__()
    self.modelone = ModelOne.to('cuda')
    self.modeltwo = ModelTwo.to('cuda')
    self.lin = nn.Linear(20, 10)  # 10 + 10 concatenated class scores
    self.softmax = nn.Softmax(dim=-1)
  def forward(self, x):
    out = torch.cat((self.modelone(x), self.modeltwo(x)), dim=-1)
    out = self.softmax(self.lin(out))
    return out

level_one = [torchvision.models.alexnet(pretrained=False) for i in range(16)]

def one_level(combination_type, number, level):
    # combine adjacent pairs: (level[0], level[1]), (level[2], level[3]), ...
    return [combination_type(level[i], level[i+1]) for i in range(number) if i%2==0]

level_two = one_level(CombineBase, 16, level_one)
level_three = one_level(CombineMiddle, 8, level_two)
level_four = one_level(CombineMiddle, 4, level_three)
top_level = CombineMiddle(level_four[0], level_four[1])

torchsummary.summary(top_level, (3, 128, 128), batch_size=100)

Would this ensure that the weights of each neural network get updated differently?

Also, it gives an error when using CUDA: expected backend CPU but got CUDA for argument #4 'mat1'.

In this code snippet you will initialize 16 different AlexNets, which are then shared across the different levels.
E.g. you could print the id of a parameter and check that it's the same object:

print(id(level_one[0].classifier[1].weight))
> 140500209576624
print(id(level_two[0].modelone.classifier[1].weight))
> 140500209576624
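
If this sharing is unintended, one option (a sketch, not from the original answer) is to deep-copy a model before reusing it, so each copy owns an independent set of parameters:

import copy

model = torchvision.models.alexnet(pretrained=False)
model_copy = copy.deepcopy(model)  # same architecture, independent parameter tensors

print(id(model.classifier[1].weight) == id(model_copy.classifier[1].weight))
> False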

The device error most likely comes from the usage of to('cuda') inside __init__.
Usually you would call to(device) once on the complete model to make sure all parameters are pushed to the device.
Since you are not using it in this way, self.lin (and self.lin_one / self.lin_two) might still be on the CPU.
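
A minimal sketch of that fix, assuming the to('cuda') calls inside __init__ have been removed: move the complete model once it has been built, which pushes every registered submodule and parameter, including the linear layers, to the same device.

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
top_level = top_level.to(device)  # moves all submodules and parameters at once
x = torch.randn(100, 3, 128, 128, device=device)  # dummy batch of 128x128 RGB images
out = top_level(x)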

It gives 10% accuracy on CIFAR-10; I resized the images to (128, 128).
Changing it to a combination of two 4-layer convnets gives 70% accuracy.