Weight initialisation on custom Model

I constructed a custom model, shown in the figure below; it is a combination of multiple classes.

The figure shows a single custom hidden layer (everything in between the input and output).

Finally, I created an object using:

DGM_model = DGMArch(len(input_tensor), len(output_tensor))

I wanted to implement Glorot normal initialisation (torch.nn.init.xavier_normal_()), but DGM_model.weight is not available.

I know I can implement this by iterating over each parameter, following this article.
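For a generic model, that iteration would look roughly like the sketch below (init_weights is just a placeholder name, and I am assuming every trainable layer is an nn.Linear):

import torch.nn as nn

def init_weights(m):
    # apply Glorot normal init to every linear submodule
    if isinstance(m, nn.Linear):
        nn.init.xavier_normal_(m.weight, gain=1.0)
        nn.init.zeros_(m.bias)

DGM_model.apply(init_weights)  # .apply() recurses into every submodule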

How do I implement this inside the __init__() function of the DGMArch class? Here is my weight-initialisation code for sequential models:

for i in range(len(layers) - 1):
    # initialise weights from a Glorot normal distribution
    # (recommended gain value for tanh would be 5/3)
    nn.init.xavier_normal_(self.linears[i].weight, gain=1.0)

    # set biases to zero
    nn.init.zeros_(self.linears[i].bias)

You would have to access the right layers inside the model, e.g.:

torch.nn.init.xavier_uniform_(DGM_model.layer.weight)

where DGM_model.layer accesses the corresponding nn.Module, whose internal parameters you can then reach via .weight and .bias.
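If you are unsure which attribute names your model exposes, a quick sketch like this prints every submodule so you can pick the right ones:

for name, module in DGM_model.named_modules():
    # prints e.g. 'layers.0', 'layers.1', ... together with their class names
    print(name, type(module).__name__)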


I wanted to implement the initialisation within the DGMArch class itself. This is how I got it done:

self.layers = []
for l in range(nr_layers):  # creating the DGM arch by appending all DGM layers
    self.layers.append(DGMLayer(d_in, layer_size, activation_fn, weight_norm))  # creates one DGM cell

# difference between nn.Sequential and nn.ModuleList:
# https://discuss.pytorch.org/t/when-should-i-use-nn-modulelist-and-when-should-i-use-nn-sequential/5463/9
# basically, nn.Sequential also defines the forward pass for you.
self.layers = nn.Sequential(*self.layers)  # pass the collected layers to nn.Sequential

# initialising custom weights: isinstance() must be checked on each
# submodule, not on self, since the model itself is never an nn.Linear
for module in self.modules():
    if isinstance(module, nn.Linear):  # for each weight matrix
        torch.nn.init.xavier_uniform_(module.weight)
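Alternatively, the same initialisation can live entirely inside __init__() via self.apply(), which recursively visits every submodule (including any linear layers nested inside each DGMLayer). A minimal sketch, with the real DGMLayer stack replaced by plain nn.Linear layers and _init_weights being a name I picked:

import torch.nn as nn

class DGMArch(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        # placeholder stack; the real model appends DGMLayer cells here
        self.layers = nn.Sequential(
            nn.Linear(d_in, 50), nn.Tanh(), nn.Linear(50, d_out)
        )
        self.apply(self._init_weights)  # visits self and every submodule

    @staticmethod
    def _init_weights(m):
        if isinstance(m, nn.Linear):  # for each weight matrix
            nn.init.xavier_uniform_(m.weight)
            nn.init.zeros_(m.bias)

This keeps the initialisation logic in one place, and it still works if you later nest layers more deeply, since .apply() does the recursion for you.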