When reading through other people’s PyTorch code, I often notice models being given multiple ReLU attributes, like so:
import torch.nn as nn

class DummyModel(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        self.fc1 = nn.Linear(hidden_size, hidden_size)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, hidden_size)
        self.relu2 = nn.ReLU()

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu1(x)
        x = self.fc2(x)
        x = self.relu2(x)
        return x
I’m just curious whether there is any advantage to doing this, as opposed to
class DummyModel2(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        self.fc1 = nn.Linear(hidden_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, hidden_size)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        x = self.relu(x)
        return x
Since nn.ReLU is stateless (it has no parameters), it seems to me that these two models will produce exactly the same result. So I’m curious: is there some reason to give a model more than one ReLU attribute? Is it just a way of mentally organizing the flow of the code?
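For what it’s worth, here is a quick sanity-check sketch (assuming both class definitions above are in scope; hidden_size = 8 and the batch size of 4 are arbitrary placeholders) that copies one model’s weights into the other and compares their outputs:

import torch

torch.manual_seed(0)
hidden_size = 8

m1 = DummyModel(hidden_size)
m2 = DummyModel2(hidden_size)
# The ReLU modules hold no parameters, so both models have identical
# state dict keys (fc1.weight, fc1.bias, fc2.weight, fc2.bias).
m2.load_state_dict(m1.state_dict())

x = torch.randn(4, hidden_size)
print(torch.allclose(m1(x), m2(x)))   # True: the outputs match
print(list(nn.ReLU().parameters()))   # []: nn.ReLU is stateless

As far as I can tell, this confirms the two versions compute the same function, so my question is only about whether the separate attributes buy you anything else.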