I am implementing an algorithm by Google (which was written in TensorFlow 1.0) in PyTorch. The TensorFlow implementation defines a classification head on top of a BERT model as follows:
```python
hidden_size = output_layer.shape.as_list()[-1]
output_weights = tf.get_variable(
    "output_weights", [hidden_size],
    initializer=tf.zeros_initializer()
    if config.init_cell_selection_weights_to_zero else _classification_initializer())
output_bias = tf.get_variable(
    "output_bias", shape=(), initializer=tf.zeros_initializer())
```
I want to define the same in PyTorch. I defined it as follows:
```python
class net(nn.Module):
    def __init__(self, config):
        super().__init__()
        # classification head
        if config.init_cell_selection_weights_to_zero:
            self.output_weights = nn.Parameter(torch.zeros(config.hidden_size))
        else:
            self.output_weights = nn.Parameter(torch.empty(config.hidden_size))
            nn.init.normal_(self.output_weights, std=0.02)  # here, a truncated normal is used in the original implementation
        self.output_bias = nn.Parameter(torch.zeros([]))

    def forward(self, ...):
```
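As an aside regarding the truncated-normal comment in the snippet above: if I understand correctly, recent PyTorch versions (1.5+, an assumption on my part) ship `nn.init.trunc_normal_`, which may be a closer match than `nn.init.normal_`. A sketch:

```python
import torch
import torch.nn as nn

# Hypothetical hidden size, just for illustration.
w = torch.empty(768)

# TF's truncated normal discards samples more than two standard
# deviations from the mean, which corresponds to a = -2*std, b = 2*std.
nn.init.trunc_normal_(w, std=0.02, a=-0.04, b=0.04)
```

After this call, all values of `w` lie within two standard deviations of zero.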
In other words, I am using `torch.nn.Parameter` as a counterpart of `tf.get_variable` to define this additional trainable layer. This adds the classification head to the parameters of the model (i.e. they are printed when I print `list(model.parameters())`), but they are not printed when I type `print(model)`.
Is this because they are not registered yet? Do I have to use `register_parameter` in my init function?
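Here is a minimal reproduction of what I am seeing, with a stripped-down hypothetical module (just the two parameters, no BERT backbone):

```python
import torch
import torch.nn as nn

class Head(nn.Module):
    def __init__(self, hidden_size=4):
        super().__init__()
        # Assigning an nn.Parameter as an attribute on a module.
        self.output_weights = nn.Parameter(torch.zeros(hidden_size))
        self.output_bias = nn.Parameter(torch.zeros([]))

model = Head()
# The parameters do show up here ...
print([name for name, _ in model.named_parameters()])
# ... but print(model) only lists child modules, not parameters.
print(model)
```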