Transfer learning in parallel NNs

I am building a parallel NN for a PINN and I need to use transfer learning, but I do not know how. Can someone please guide me on how to use different pre-trained networks in a parallel NN?

```python

class Net2_kc(nn.Module):

    def __init__(self):
        super(Net2_kc, self).__init__()

        def branch():
            # identical MLP used for every parallel sub-network
            return nn.Sequential(
                nn.Linear(input_n, hidden_dim),
                Swish(),
                # nn.ReLU(),
                nn.Linear(hidden_dim, hidden_dim),
                Swish(),
                nn.Linear(hidden_dim, hidden_dim),
                Swish(),
                nn.Linear(hidden_dim, hidden_dim),
                Swish(),
                nn.Linear(hidden_dim, 1),
            )

        self.k3 = branch()
        self.k4 = branch()
        self.k5 = branch()
        self.k6 = branch()
        self.k7 = branch()

        self.final = nn.Sequential(
            # a single nn.Linear(..., 1) here gave very bad results!
            nn.Linear(5, 50),  # 5 = number of concatenated branch outputs
            nn.Linear(50, 1),
        )

    def forward(self, x):
        k0 = self.k3(x)
        k1 = self.k4(x)
        k2 = self.k5(x)
        k3 = self.k6(x)
        k4 = self.k7(x)

        output = torch.cat((k0, k1, k2, k3, k4), dim=1)
        output2 = self.final(output)

        return output2

```

How can I use a pre-trained network for each of k3, k4, k5, k6, and k7 separately?

Do I need to do something like this:

```python
net2_kc.k3.load_state_dict(torch.load(path))
net2_kc.k4.load_state_dict(torch.load(path))
```

Yes, you could index the internal submodules and load the pre-trained state_dicts.

Can you please give me an example?

Your posted code is already a minimal example, assuming you have stored the pre-trained state_dicts for each submodule beforehand.

```python
net2_kc.k3.load_state_dict(torch.load(path))
net2_kc.k4.load_state_dict(torch.load(path))
```

This does not work; it gives the error below.

```
RuntimeError: Error(s) in loading state_dict for Sequential:
Missing key(s) in state_dict: "0.weight", "0.bias", "2.weight", "2.bias", "4.weight", "4.bias", "6.weight", "6.bias", "8.weight", "8.bias".
Unexpected key(s) in state_dict: "main.0.weight", "main.0.bias", "main.2.weight", "main.2.bias", "main.4.weight", "main.4.bias", "main.6.weight", "main.6.bias", "main.8.weight", "main.8.bias".
```

ptrblck,

What is causing this error?

The error is raised since your current k3 or k4 modules use a different structure (i.e. they are nn.Sequential containers) while the state_dict contains a main module, which seems to use an nn.Sequential container internally.
You could either change the state_dict keys to match the newly created submodule, or change the module to use the main attribute.
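A sketch of the first option, remapping the keys: since the stored state_dict has keys like "main.0.weight" while a plain nn.Sequential expects "0.weight", you can strip the "main." prefix before loading. The module names and layer sizes below are placeholders, not the ones from the thread:

```python
import torch
import torch.nn as nn

# old-style module: an nn.Sequential wrapped in a "main" attribute
class OldK3(nn.Module):
    def __init__(self):
        super().__init__()
        self.main = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 1))

old = OldK3()
sd = old.state_dict()  # keys: "main.0.weight", "main.0.bias", ...

# remap: drop the leading "main." from every key
sd = {k.replace("main.", "", 1): v for k, v in sd.items()}

# new-style submodule: a bare nn.Sequential expecting "0.weight", ...
new_k3 = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 1))
new_k3.load_state_dict(sd)
```

In your case you would load the stored state_dict with torch.load(path) first and then apply the same key remapping before calling load_state_dict on the submodule.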

The pre-trained network size matches the submodule.
The pre-trained network k3 is like this:
```python
class k3(nn.Module):

    def __init__(self):
        super(k3, self).__init__()
        self.main = nn.Sequential(
            nn.Linear(input_n, h_nk),
            Swish(),
            nn.Linear(h_nk, h_nk),
            Swish(),
            nn.Linear(h_nk, h_nk),
            Swish(),
            nn.Linear(h_nk, h_nk),
            Swish(),
            nn.Linear(h_nk, 1),
        )

    def forward(self, x):
        output = self.main(x)
        return output
```
Here h_nk is equal to hidden_dim.
I have the same NN for k4…k7. Each network's weights are saved separately and loaded via its own state_dict.

As you can see in the posted code snippet, the difference is that your new model uses:

self.k3 = nn.Sequential(...

while the old module was defined as a custom nn.Module k3 with a self.main attribute containing the nn.Sequential.
If you want to directly load a state_dict, make sure the architectures match exactly.
Otherwise you would have to manipulate e.g. the state_dict keys.

Right. But if I want to train all pre-trained networks separately, how can I define the pre-trained NN to match the submodule?

In Net2_kc use self.k3 = k3(...).

Sorry, I did not understand. How is k3 defined in Net2_kc, as self.k3 = k3(...)?

Here is a minimal code snippet showing how to load the previously stored k3 state_dict:

```python
class k3(nn.Module):
    def __init__(self):
        super(k3, self).__init__()
        self.main = nn.Sequential(
            nn.Linear(10, 10)
        )

    def forward(self, x):
        output = self.main(x)
        return output


class Net2_kc(nn.Module):
    def __init__(self):
        super(Net2_kc, self).__init__()
        self.k3 = k3()

    def forward(self, x):
        k0 = self.k3(x)
        return k0


# store state_dict from the original k3 module
model = k3()
sd = model.state_dict()

# load it into the new model
model = Net2_kc()
model.k3.load_state_dict(sd)
# > <All keys matched successfully>
```
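If you also want to keep the pre-trained branches fixed while training the rest of the network (a common transfer-learning setup), you could freeze their parameters after loading. A minimal sketch, assuming a stand-in model where only the final head should train (the class and layer sizes are placeholders):

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # stands in for a pre-trained branch such as k3
        self.k3 = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 1))
        # stands in for the new `final` head
        self.final = nn.Linear(1, 1)

    def forward(self, x):
        return self.final(self.k3(x))

model = Net()

# freeze the pre-trained branch so its weights are not updated
for p in model.k3.parameters():
    p.requires_grad_(False)

# pass only the still-trainable parameters to the optimizer
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
```

If you instead want to fine-tune the pre-trained branches together with the head, simply skip the freezing step and pass model.parameters() to the optimizer.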

Thanks ptrblck,

The reason I was confused about k3() was that I had not defined the class k3 in my current code. After defining it, it is working now.

Thanks again for your help.