You can definitely reuse the same ReLU activation, since it doesn't hold any internal state.
For dropout, I understand why reusing a single module might seem problematic, but the nn.Dropout module itself calls the functional API F.dropout at each forward call, so each call re-randomizes which elements are dropped, regardless of whether it's several modules or just the one! See the source code here.
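Here is a minimal sketch illustrating this (the variable names are just for the example): the same shared nn.Dropout instance samples a fresh mask on every forward call, so two consecutive calls give different outputs.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# One shared ReLU and one shared Dropout, reused in two places.
shared_relu = nn.ReLU()
shared_drop = nn.Dropout(p=0.5)

x = torch.ones(1, 8)

# Each forward call to the shared dropout module samples a new mask,
# even though it is the same module instance.
out1 = shared_drop(shared_relu(x))
out2 = shared_drop(shared_relu(x))

print(out1)
print(out2)
print(torch.equal(out1, out2))  # almost certainly False: the mask is re-sampled per call
```

(In eval mode, i.e. after calling `.eval()`, dropout becomes a no-op and both outputs would match.)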