Dynamically add or delete layers

Hi

How can I dynamically add or delete layers during training? Or how can I modify the network architecture after each epoch?

Many thanks.

You can define a method on your network class that updates the architecture and call it at the end of every epoch. The method should do something that changes how .forward behaves.
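
For example, here is a minimal sketch of that idea (the class name GrowableNet, the layer sizes, and the growth rule are made up for illustration, not taken from any particular library):

import torch
import torch.nn as nn

class GrowableNet(nn.Module):
    def __init__(self):
        super().__init__()
        # start with a single hidden layer
        self.hidden = nn.ModuleList([nn.Linear(10, 10)])
        self.out = nn.Linear(10, 1)

    def forward(self, x):
        for layer in self.hidden:
            x = torch.relu(layer(x))
        return self.out(x)

    def add_layer(self):
        # called at the end of an epoch: .forward now runs one more layer
        self.hidden.append(nn.Linear(10, 10))

model = GrowableNet()
# ... train for one epoch ...
model.add_layer()  # the architecture changes before the next epoch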

I think my page on GitHub can guide you toward your goal.

I hope it helps!

Did you manage to do it?
I had a read through morztezamg63's GitHub, but that is about modifying layers that already exist in the class.
How would you add new layers dynamically without changing the class's core code?

@Akis_Linardos you cannot add new layers dynamically without changing the code. In PyTorch, the code is the model.

Is there no way to do it with an OrderedDict? If the constructor iterates over a number given as an argument and appends layers to the OrderedDict, it seems to work without changing the class code. So the first time we may train a network with 3 layers, and the second time a network with 4 layers. Here's my piece of code:

def __init__(self, blocklist, num_classes=1, stride=[1, 2, 2], block=BasicBlock):
    """
    _make_block creates a residual block like those used in ResNet.
    LastUpdatedOrderedDict is a subclass of OrderedDict that stores items
    in the order the keys were last added.
    """
    # blocklist specifies the number of layers in each block

    self.inplanes = 64
    super(ModNet, self).__init__()

    # This structure has the advantage of exposing the number of layers as a
    # hyperparameter. The hyperparameters here are lists of equal length (one
    # entry per block) giving the number of layers and the stride of each block.
    layers = LastUpdatedOrderedDict([
            ('conv1', nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
                                bias=False)),
            ('bn1', nn.BatchNorm2d(64)),
            ('relu1', nn.ReLU(inplace=True)),
            ('maxpool', nn.MaxPool2d(kernel_size=3, stride=2, padding=1)),
            ('layer1', self._make_block(block, 64, blocklist[0], stride=stride[0]))
            ])
    p = 128
    for i in range(1, len(blocklist)):
        layer = self._make_block(block, p, blocklist[i], stride=stride[i])
        layers["layer{}".format(i + 1)] = layer  # appends a new layer to the dict
        p = 2 * p

    self.stacked_layers = nn.Sequential(layers)
    self.avgpool = nn.AvgPool2d(14, stride=1)
    self.fc = nn.Linear(256 * block.expansion, num_classes)

However, my issue is this: I want to load a trained model, add one layer to it, and do a sort of transfer learning that way. Is this not possible in PyTorch?

Hi @Akis_Linardos!

This should be possible. The following snippet is a good example:

from torch import nn
from collections import OrderedDict

class Net(nn.Module):
    
    def __init__(self, n_layers):
        
        super().__init__()
        
        layers = OrderedDict()
        for i in range(n_layers):
            layers[str(i)] = nn.Linear(5,5)
            
        self.layers = nn.Sequential(layers)
        print(self)
        
Net(n_layers=3)

The output will then be

Net(
  (layers): Sequential(
    (0): Linear(in_features=5, out_features=5)
    (1): Linear(in_features=5, out_features=5)
    (2): Linear(in_features=5, out_features=5)
  )
)

While this is just a crude example, you can add more complexity to it with additional logic.
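
For the transfer-learning part of your question, one approach is to build the larger network and load the pretrained weights into its first layers. Here is a rough sketch along those lines, reusing the Net class above; the file name net_3layers.pth and the choice to freeze the old layers are just illustrative assumptions:

import torch

# pretend this 3-layer network has already been trained and saved
old_net = Net(n_layers=3)
torch.save(old_net.state_dict(), 'net_3layers.pth')

# build a 4-layer network; strict=False lets load_state_dict ignore the
# parameters of the new layer that are missing from the checkpoint
new_net = Net(n_layers=4)
new_net.load_state_dict(torch.load('net_3layers.pth'), strict=False)

# optionally freeze the pretrained layers and train only the new one
for layer in list(new_net.layers)[:3]:
    for p in layer.parameters():
        p.requires_grad = False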

Hey,
I solved it in my ReLU NN in the following way:

import torch as tr

class ReLU_NN(tr.nn.Module):
    '''
    Class for a ReLU-NN with variable size.
    Input:
        -nn_list = [input dim., first hidden layer size,...,
                    last hidden layer size, output dim.]

    '''
    def __init__(self, nn_list):
        super(ReLU_NN, self).__init__()
        self.nn_list = nn_list
        self.hidden  = tr.nn.ModuleList()
        for i in range(len(nn_list)-1):
            self.hidden.append(tr.nn.Linear(nn_list[i], nn_list[i+1]).double())

    def forward(self, x):
        # forward pass through the network
        for layer in self.hidden[:-1]:
            x = tr.nn.functional.relu(layer(x))

        # the last layer is a linear layer without activation function
        output = self.hidden[-1](x)

        return output

    def add_layers(self, nn_add_list):
        '''
        Adds some layer between the last hidden layer and the output layer
        Input:
            -nn_add_list: a list which length defines the number of layer to be
                          added and each entry defines the number of neurons
                          e.g. [num1,...,numL]
                          Then self.hidden and self.nn_list will be adjusted
                          in the following way:
                          self.hidden = [old layer,...,old layer,new layer 1,...,
                                         new layer L, new output layer]
                          self.nn_list = [input-dim, old-layer-size,...,
                                          old-layer-size, num1,...,numL, output-
                                          dim]
        '''
        # adjust nn_list
        length = len(nn_add_list)
        temp   = self.nn_list[-1]
        self.nn_list.extend(nn_add_list)
        for i in range(length, 0, -1):
            self.nn_list[-i-1] = self.nn_list[-i]
        self.nn_list[-1] = temp

        # adjust hidden
        for i in range(length+2, 2, -1):
            self.hidden.insert(len(self.hidden)-1,\
                    tr.nn.Linear(self.nn_list[-i], self.nn_list[-i+1]).double())
        
        # adjust last hidden layer
        self.hidden[-1] = tr.nn.Linear(self.nn_list[-2], self.nn_list[-1]).double()

I have also implemented a method for adding neurons to the layers, but that is a little bit trickier. If someone wants it, I can show the code snippet here. The real problem is that you have to dynamically change the parameter groups within the optimizer, and as far as I know that is not possible. All you can do is redefine the optimizer with the new model.parameters() (keeping the pre-optimized weights and biases, of course) and restart the optimization process.
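
For completeness, here is roughly what that workaround looks like with the ReLU_NN above (the layer sizes and learning rate are just example values):

model = ReLU_NN([4, 16, 1])
optimizer = tr.optim.Adam(model.parameters(), lr=1e-3)

# ... train for a while ...

model.add_layers([8, 8])  # grow the network between the last hidden layer and the output

# simplest workaround: rebuild the optimizer over all (old + new) parameters;
# the already-trained weights are kept, only the optimizer state is reset
optimizer = tr.optim.Adam(model.parameters(), lr=1e-3)

A possible alternative, if you want to keep the optimizer state of the old parameters, is optimizer.add_param_group({'params': ...}) with only the newly created layers, but the rebuilt-optimizer route above is the straightforward one.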

Thank you for your solution here. Could you please show the code for adding neurons to an existing hidden layer?