When I have a network with 5 layers that should use dropout, do I need a separate nn.Dropout instance for each layer, or can I just reuse one (assuming all layers have the same dropout rate)?
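To make the question concrete, here is a minimal sketch of what I mean by reusing a single instance (the layer sizes are made up, and I shortened it to 3 layers):

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, p=0.5):
        super().__init__()
        # one shared Dropout instance, applied after every hidden layer
        self.dropout = nn.Dropout(p)
        self.fc1 = nn.Linear(10, 20)
        self.fc2 = nn.Linear(20, 20)
        self.fc3 = nn.Linear(20, 5)

    def forward(self, x):
        x = self.dropout(torch.relu(self.fc1(x)))
        x = self.dropout(torch.relu(self.fc2(x)))
        return self.fc3(x)

net = Net()
out = net(torch.randn(4, 10))
```

Is this equivalent to defining self.dropout1 and self.dropout2 separately?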
Is there an overview or a simple rule for deciding which layers can be safely reused and which cannot? I mean, layers such as ReLU or Tanh have no internal state, so they can be reused. Layers like Conv2d or Linear have weights, so they should not be reused (unless I want weight sharing).
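This is how I currently understand the rule, as a sketch (the sizes are arbitrary): reusing a stateless ReLU is harmless, while reusing a Linear means the same weight matrix is applied twice, i.e. weight sharing.

```python
import torch
import torch.nn as nn

relu = nn.ReLU()          # stateless: safe to reuse anywhere
shared = nn.Linear(8, 8)  # has weights: reusing it ties the weights

x = torch.randn(2, 8)
# the same weight matrix is applied in both calls,
# so shared.weight accumulates gradients from both applications
y = relu(shared(relu(shared(x))))
y.sum().backward()
```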
But with Dropout I am not sure. Does this layer need to remember which connections it dropped in the forward() pass in order to compute the backward pass correctly? If so, reusing one instance across layers would presumably overwrite that information.
Sorry if this is a stupid question; I could not find an answer, and I want to keep my networks as simple as possible.