Model performance changes when unused components are present in the model's __init__ method

Hi,

I am observing the following weird behaviour with my model.

import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, encoder, classifier):
        super(Model, self).__init__()
        self.gru = torch.nn.GRU(input_size=100, hidden_size=300, dropout=0.5, num_layers=3, bidirectional=True)
        self.dp1 = torch.nn.Dropout(0.7)
        self.dense1 = TimeDistributed(torch.nn.Linear(600, 100))  # TimeDistributed is a custom wrapper module (not shown)
        self.encoder = encoder
        self.classifier = classifier
    def forward(self, x, m):
        #import pdb
        #pdb.set_trace()
        #x, h = self.gru(x)
        #x = self.dp1(x)
        #x = self.dense1(x)
        x = self.encoder(x, m)
        x = self.classifier(x)
        return x

When I build and train this model, I get a binary classification accuracy of 60.77%.

But when I also comment out these components in __init__ (they are already commented out in forward), the accuracy decreases slightly to 59.6%.

class Model(nn.Module):
    def __init__(self, encoder, classifier):
        super(Model, self).__init__()
        #self.gru = torch.nn.GRU(input_size=100, hidden_size=300, dropout=0.5, num_layers=3, bidirectional=True)
        #self.dp1 = torch.nn.Dropout(0.7)
        #self.dense1 = TimeDistributed(torch.nn.Linear(600, 100))
        self.encoder = encoder
        self.classifier = classifier
    def forward(self, x, m):
        #import pdb
        #pdb.set_trace()
        #x, h = self.gru(x)
        #x = self.dp1(x)
        #x = self.dense1(x)
        x = self.encoder(x, m)
        x = self.classifier(x)
        return x

So, why does the performance differ between the two cases?

Note: I have kept the random seed constant and enabled CUDA's deterministic behaviour, so each run with no model changes produces the same accuracy.
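
For reference, a minimal sketch of this kind of seeding/determinism setup (assuming the standard torch and cuDNN flags; the exact seed value is arbitrary):

import random
import numpy as np
import torch

seed = 42  # arbitrary fixed seed
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed_all(seed)

# Force cuDNN to pick deterministic kernels instead of benchmarking for the fastest ones
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False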

Thanks,


Each layer creation randomly initializes its parameters (if it has any) and thus calls into the pseudorandom number generator. If your training is sensitive to the seed, these additional layer creations can change the training results significantly; you should see the same effect by removing the layer creation and simply changing the seed at the beginning of your script.
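
A minimal sketch illustrating the effect (standard torch APIs; the layer sizes are arbitrary). Constructing an extra module advances the RNG state, so the modules created afterwards start from different weights even under the same seed:

import torch
import torch.nn as nn

torch.manual_seed(0)
_unused = nn.GRU(input_size=100, hidden_size=300, num_layers=3, bidirectional=True)  # consumes RNG draws
lin_a = nn.Linear(600, 100)

torch.manual_seed(0)
lin_b = nn.Linear(600, 100)  # same seed, but no extra layer constructed first

# The "same" layer now starts from different weights, which changes the training trajectory
print(torch.equal(lin_a.weight, lin_b.weight))  # False

So the two runs above effectively compare two different random initializations of the encoder and classifier, not two architectures.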