Training time when using Dropout with p=0.0

I want to keep my model configurable, including being able to set the dropout probability to different values. The easiest way would be to create all my Dropout layers with something like

def __init__(self, ..., dropout_prob=0.0, ...):
    ...
    self.dropout1 = nn.Dropout(p=dropout_prob)
    ...

def forward(self, X):
    ...
    out = self.dropout1(out)
    ...

I have no doubt this works fine. I only wonder whether I sacrifice any noteworthy performance when my dropout probability is 0.0, which makes all Dropout layers essentially identity functions. In principle, I could do something like

def forward(self, X):
    ...
    if self.dropout_prob is not None and self.dropout_prob > 0.0:
        out = self.dropout1(out)
    ...

Would this have any measurable advantage in practice?
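
If it helps, a quick micro-benchmark along these lines should answer that empirically (just a sketch; the tensor shape and iteration count are arbitrary choices of mine):

    import timeit

    import torch
    import torch.nn as nn

    x = torch.randn(256, 1024)
    dropout = nn.Dropout(p=0.0)

    # Compare calling the p=0.0 layer against skipping it entirely
    t_call = timeit.timeit(lambda: dropout(x), number=10_000)
    t_skip = timeit.timeit(lambda: x, number=10_000)
    print(f"calling dropout(p=0.0): {t_call:.4f}s, skipping it: {t_skip:.4f}s")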

Internally, dropout with p=0 behaves as an identity, so there should be no need for the if statement.
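
A quick sanity check (a minimal sketch, with an arbitrary tensor shape) shows that the output matches the input exactly for p=0.0, in both train and eval mode:

    import torch
    import torch.nn as nn

    drop = nn.Dropout(p=0.0)
    x = torch.randn(4, 8)

    drop.train()
    assert torch.equal(drop(x), x)  # nothing is zeroed, and the 1/(1-p) rescaling is a no-op

    drop.eval()
    assert torch.equal(drop(x), x)  # dropout is always a no-op in eval mode

    print("Dropout(p=0.0) returns the input unchanged")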
