Using the same dropout object for multiple dropout layers?

Hi, can I use the same dropout object for multiple dropout layers? And the same ReLU object? Or do these need to be created individually for each separate use in a layer?

e.g. this:

import torch.nn as nn

class Model(nn.Module):
  def __init__(self):
    super().__init__()
    self.dropout = nn.Dropout(0.5)
    self.relu = nn.ReLU()
    self.lin1 = nn.Linear(4096, 4096)
    self.lin2 = nn.Linear(4096, 4096)
    self.lin3 = nn.Linear(4096, 4096)
  def forward(self, x):
    output = self.dropout(self.relu(self.lin1(x)))
    output = self.dropout(self.relu(self.lin2(output)))
    output = self.dropout(self.relu(self.lin3(output)))
    ....

vs:

class Model(nn.Module):
  def __init__(self):
    super().__init__()
    self.dropout1 = nn.Dropout(0.5)
    self.dropout2 = nn.Dropout(0.5)
    self.dropout3 = nn.Dropout(0.5)
    self.relu1 = nn.ReLU()
    self.relu2 = nn.ReLU()
    self.relu3 = nn.ReLU()
    self.lin1 = nn.Linear(4096, 4096)
    self.lin2 = nn.Linear(4096, 4096)
    self.lin3 = nn.Linear(4096, 4096)
  def forward(self, x):
    output = self.dropout1(self.relu1(self.lin1(x)))
    output = self.dropout2(self.relu2(self.lin2(output)))
    output = self.dropout3(self.relu3(self.lin3(output)))
    ....

I am pretty sure it wouldn't make a difference for the ReLU, but I am not so convinced about the dropout.


You can definitely use the same ReLU activation, since it doesn't hold any state.

For dropout, I can see why it might not work, but the nn.Dropout module itself calls the functional API F.dropout on each forward pass, so each call samples a fresh dropout mask, regardless of whether you use several modules or just one! See the source code here.
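
For example, a quick check along these lines (a minimal sketch; the tensor size and variable names are just illustrative) shows that two calls through one shared nn.Dropout module zero out different elements:

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
drop.train()  # dropout is only active in training mode

x = torch.ones(10)
a = drop(x)
b = drop(x)

# The mask is re-sampled on every forward pass, so the zeroed
# positions of a and b almost always differ.
print(a == 0)
print(b == 0)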


Hi, have you figured this out? I am having the same question.

Have a look here.

I had actually seen your reply, but I am not sure it addresses what the question asked. Thanks for following up, though.

Excuse me for jumping into the conversation, but this was one of the bits of style advice in a tutorial I did earlier this week.

While it would technically work for vanilla PyTorch use, I would consider it bad advice to re-use layers. This includes ReLU and Dropout.

My style advice is to use the functional interface when you don't want state, and to instantiate one object per use case if you do (see the sketch after the list below).

The reason is that re-using modules causes more confusion than benefit:

  • It looks odd when printing the model.
  • It may confuse other analysis tools.
  • When you do advanced things, e.g. quantization, ReLU suddenly becomes stateful because it captures data for quantization.
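
To sketch what I mean (hypothetical code, reusing the 4096-unit layers from the question):

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
  def __init__(self):
    super().__init__()
    self.lin1 = nn.Linear(4096, 4096)
    self.lin2 = nn.Linear(4096, 4096)
    self.lin3 = nn.Linear(4096, 4096)
    self.p = 0.5
  def forward(self, x):
    # Stateless ops go through the functional interface; passing
    # self.training keeps dropout active only in training mode.
    output = F.dropout(F.relu(self.lin1(x)), p=self.p, training=self.training)
    output = F.dropout(F.relu(self.lin2(output)), p=self.p, training=self.training)
    output = F.dropout(F.relu(self.lin3(output)), p=self.p, training=self.training)
    return output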

Best regards

Thomas


Thank you, Thomas, for your answer; I think that is what I was looking for. Also, I usually call ReLU directly in forward without declaring it in __init__, but I wasn't sure if that is best practice.

From my personal view, using different dropout objects would be better, because using the same dropout could deactivate the same neurons across different layers. Separate dropouts would each deactivate neurons at random, mitigating overfitting.


Did you try this, though?

I would consider this a red herring, since calling into the same dropout module still draws a different random mask on every call.
Not re-using a single module is really a matter of cleanliness in expressing the structure as code.

Best regards

Thomas


After some testing, different F.dropout2d calls do seem to deactivate different parts of the tensor.
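
The test looked roughly like this (a minimal sketch; the shapes are arbitrary):

import torch
import torch.nn.functional as F

x = torch.ones(1, 8, 4, 4)  # (batch, channels, height, width)

# dropout2d zeroes entire channels; each call samples its own channel mask.
a = F.dropout2d(x, p=0.5, training=True)
b = F.dropout2d(x, p=0.5, training=True)

print(a.sum(dim=(2, 3)) == 0)  # channels zeroed in the first call
print(b.sum(dim=(2, 3)) == 0)  # usually a different set of channels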


Thanks for your explanation. As I understand it, during inference of a quantized model the ReLU works differently than during training (during training it learns parameters that correct for the quantization error). But the point about analysis tools is not so clear to me; could you please give an example of such a case or tool?