Actually I use:
torch.nn.Sequential(model, torch.nn.Softmax())
but it creates a new sequence with my model as the first element and the softmax after it. It's not adding the softmax to the model's sequence.
I know these two networks will be equivalent, but I feel it's not really the correct way to do it.
It should generally work.
Here is a small example:
import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.fc = nn.Linear(10, 2)

    def forward(self, x):
        x = self.fc(x)
        return x

model = MyModel()
x = torch.randn(1, 10)
print(model(x))
> tensor([[-0.2403, 0.8158]], grad_fn=<ThAddmmBackward>)
model = nn.Sequential(
    model,
    nn.Softmax(1)
)
print(model(x))
> tensor([[0.2581, 0.7419]], grad_fn=<SoftmaxBackward>)
As you can see, the output was normalized using softmax in the second call.
Also the grad_fn points to softmax.
Could you print your model after adding the softmax layer to it?
Sorry I was probably not clear.
This is how I create my model.
But when I print my model, it’s a model inside a model, inside a model, inside a model, not a list of layers.
Is there a better way to do that?
def getMultiLayerPerceptron(InputNetworkSize, NbClass, nbHidenLayer,
                            hidden_dimension_size, activationFunction):
    model = torch.nn.Sequential(
        torch.nn.Linear(InputNetworkSize, hidden_dimension_size),
        activationFunction())
    for i in range(nbHidenLayer - 1):
        model = torch.nn.Sequential(
            model,
            torch.nn.Linear(hidden_dimension_size, hidden_dimension_size),
            activationFunction())
    model = torch.nn.Sequential(
        model, torch.nn.Linear(hidden_dimension_size, NbClass))
    return model
I use torch.nn.Sequential because I don't understand what I should put in the __init__ and what I should put in the forward function when using a class for a multi-layer fully connected neural network.
Well, you could also define these layers inside the __init__ of another module.
Here is an example using nn.ModuleList:
class MyModel(nn.Module):
    def __init__(self, in_features, nb_classes, nb_hidden_layer,
                 hidden_size, act=nn.ReLU):
        super(MyModel, self).__init__()
        self.act = act()
        self.fc1 = nn.Linear(in_features, hidden_size)
        # register one linear layer per requested hidden layer
        self.fcs = nn.ModuleList([nn.Linear(hidden_size, hidden_size)
                                  for _ in range(nb_hidden_layer)])
        self.out = nn.Linear(hidden_size, nb_classes)

    def forward(self, x):
        x = self.act(self.fc1(x))
        for l in self.fcs:
            x = self.act(l(x))
        x = self.out(x)
        return x
model = MyModel(2, 3, 4, 5, nn.ReLU)
You could also use nn.ModuleDict to set the layer names.
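A rough sketch of the same idea with named layers (the class name and layer names here are just placeholders):

import torch
import torch.nn as nn

class MyDictModel(nn.Module):
    def __init__(self, in_features, nb_classes, hidden_size):
        super(MyDictModel, self).__init__()
        # nn.ModuleDict registers each submodule under the given name
        self.layers = nn.ModuleDict({
            'fc1': nn.Linear(in_features, hidden_size),
            'fc2': nn.Linear(hidden_size, hidden_size),
            'out': nn.Linear(hidden_size, nb_classes),
        })
        self.act = nn.ReLU()

    def forward(self, x):
        x = self.act(self.layers['fc1'](x))
        x = self.act(self.layers['fc2'](x))
        x = self.layers['out'](x)
        return x

model = MyDictModel(2, 3, 5)
print(model)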
And how do you add a fully connected layer to a pretrained ResNet50 network?
I assume you would like to add the new linear layer at the end of the model?
If so, resnet50 uses the .fc attribute to store the last linear layer:
from torchvision import models

model = models.resnet50()
print(model.fc)
> Linear(in_features=2048, out_features=1000, bias=True)
You could store this layer and add a new nn.Sequential container as the .fc attribute via:
lin = model.fc
new_lin = nn.Sequential(
    nn.Linear(lin.in_features, lin.in_features),
    nn.ReLU(),
    lin
)
model.fc = new_lin
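A quick forward pass (assuming the standard 224x224 ImageNet-sized input) should show that the new classifier is used and the output shape stays the same:

x = torch.randn(1, 3, 224, 224)
out = model(x)
print(out.shape)
> torch.Size([1, 1000])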
And do I need to modify the forward function in the model class? Thanks
If you replace an already registered module (e.g. model.fc), you would have to make sure that the setup (expected input and output shapes) is valid. Other than that, you wouldn't need to change the forward method, and this module will still be called as in the original forward.
However, if you need changes which aren't a simple replacement of layers, I would recommend manipulating the forward method.
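For example, here is a rough sketch (just one possible approach, with made-up names) of wrapping resnet50 in a custom module and changing the forward to also return the penultimate features, which a plain layer replacement couldn't do:

import torch
import torch.nn as nn
from torchvision import models

class ResNetWithFeatures(nn.Module):
    def __init__(self, nb_classes=10):
        super(ResNetWithFeatures, self).__init__()
        self.backbone = models.resnet50()
        in_features = self.backbone.fc.in_features
        # remove the original classifier so the backbone returns the 2048-dim features
        self.backbone.fc = nn.Identity()
        self.fc = nn.Linear(in_features, nb_classes)

    def forward(self, x):
        feats = self.backbone(x)   # [batch_size, 2048]
        out = self.fc(feats)       # new classifier
        return out, feats

model = ResNetWithFeatures()
out, feats = model(torch.randn(1, 3, 224, 224))
print(out.shape, feats.shape)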
What should I do to add quant and dequant layers in a pre-trained model?
The BERT quantization tutorial seems to load a pre-trained model and apply dynamic quantization to it, so it could be helpful.
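For static quantization, a rough sketch (the wrapper name is made up; resnet18 is just an example backbone) would be to wrap the pre-trained model with QuantStub and DeQuantStub, so the input is quantized before the model and the output is converted back to float:

import torch
import torch.nn as nn
from torchvision import models

class QuantizedWrapper(nn.Module):
    def __init__(self, model):
        super(QuantizedWrapper, self).__init__()
        self.quant = torch.quantization.QuantStub()      # quantizes the float input
        self.model = model
        self.dequant = torch.quantization.DeQuantStub()  # converts the output back to float

    def forward(self, x):
        x = self.quant(x)
        x = self.model(x)
        x = self.dequant(x)
        return x

model = QuantizedWrapper(models.resnet18())  # load your pre-trained weights as needed

After that you would still have to set the qconfig, prepare, calibrate, and convert the model as described in the static quantization tutorial.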
If this discussion page had an upvote system, I would give you an upvote.