I am trying an approach to solve a specific problem that requires me to add n neurons to the last layer of a pre-trained model. The tricky part is that it has to be on the same layer. I found an example in the forums here: older question
I want to keep the pre-trained weights intact and add n randomly initialized ones.
In said question, they gave a code snippet that I modified to my problem. It looks like this:
def add_units(self, n_new):
'''
n_new : integer variable counting the neurons you want to add
'''
# take a copy of the current weights stored in self._fc which is an
# ModuleList variable with only one layer
current = [ix.weight.data for ix in self._fc]
# randomly initialize a tensor with the size of the wanted layer
hl_input = torch.zeros([n_new, current[0].shape[1]])
nn.init.xavier_uniform_(hl_input, gain=nn.init.calculate_gain('relu'))
# concatenate the old weights with the new weights
new_wi = torch.cat([current[0], hl_input], dim=0)
# reset weight and grad variables to new size
self._fc[0] = nn.Linear(current[0].shape[1]+n_new, 2)
# set the weight data to new values
self._fc[0].weight.data = torch.tensor(new_wi, requires_grad=True)
This method is inside the model class and can be called by typing “mode.add_units(N)”.
Am I preserving the original waits of the layer “_fc” and adding new, randomly initialize ones or am I missing something?
Running a quick code, adding just two nodes seems to print the right shape:
[In] print(model._fc)
[Out] Linear(in_features=2560, out_features=2)
[In] model.add_units(2)
[In] print(model._fc)
[Out] Linear(in_features=2562, out_features=2)
I am not sure if by doing this I am preserving the weights and the bias, I assumed that the last line did so but could not know for certain.