Questions about dropout

Hello, everyone

I have questions about Dropout.

  1. Should I use a different dropout mask for each forward pass over batch data?
    That is, should I generate a new dropout mask for each mini-batch at the same layer?

  2. Should the dropout mask used during backpropagation be the same as the one used during the forward pass?
    That is, should I save the dropout masks from the forward pass and apply the same mask to the corresponding layer during backpropagation?

  3. Should I avoid applying dropout to the output layer?
    Should I also avoid applying it to the input?

Thank you in advance, and have a nice weekend.

  1. Yes. A dropout mask is a tensor of the same shape as a layer’s activations, applied by elementwise multiplication in your forward pass. Sample a fresh mask on every forward pass, i.e., for every mini-batch.
  2. Yes, the backward pass must reuse the mask saved from the forward pass. Units the mask sets to ‘0’ contribute nothing to the output, so no gradient should flow back through them; multiplying the incoming gradient by the same mask enforces this.
  3. Dropout should not be applied to the output layer. Applying it to the input is possible but uncommon, and usually only with a small drop probability.
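All three points can be sketched in a small from-scratch layer. This is a minimal NumPy sketch using inverted dropout (activations are scaled by 1/(1-p) at train time so nothing needs rescaling at test time); the `Dropout` class and its method names are illustrative, not from any framework:

```python
import numpy as np

class Dropout:
    """Minimal from-scratch dropout layer (inverted dropout)."""

    def __init__(self, p=0.5, seed=None):
        self.p = p                       # drop probability
        self.rng = np.random.default_rng(seed)
        self.mask = None                 # saved for the backward pass

    def forward(self, x, train=True):
        if not train:
            return x                     # point 3: no dropout at evaluation time
        # Point 1: a fresh mask is sampled on every forward call (every
        # mini-batch), scaled by 1/(1-p) to keep expected activations unchanged.
        self.mask = (self.rng.random(x.shape) >= self.p) / (1.0 - self.p)
        return x * self.mask

    def backward(self, grad_out):
        # Point 2: reuse the mask saved during forward;
        # units that were zeroed pass no gradient back.
        return grad_out * self.mask
```

Because `forward` resamples `self.mask` on each call, two consecutive mini-batches automatically see different masks, while `backward` always matches the most recent forward pass.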

Thank you for your answer.
And this question is assuming that I am not using PyTorch nor TensorFlow.
That is, I am implementing dropout algorithm by myself for studying.
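For a from-scratch study implementation, a quick way to confirm the first point (a fresh mask per mini-batch) is to run two batches through the same dropout step and compare the masks. This sketch uses a standalone helper function with hypothetical names:

```python
import numpy as np

rng = np.random.default_rng(42)
p = 0.5  # drop probability (illustrative choice)

def dropout_forward(h, p, rng):
    """Inverted dropout: returns masked activations and the mask used."""
    mask = (rng.random(h.shape) >= p) / (1.0 - p)
    return h * mask, mask

# Two mini-batches through the same hidden layer: each gets its own mask.
batch1 = np.ones((8, 16))
batch2 = np.ones((8, 16))
out1, mask1 = dropout_forward(batch1, p, rng)
out2, mask2 = dropout_forward(batch2, p, rng)
assert not np.array_equal(mask1, mask2)  # fresh mask on each forward pass
```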

Yes, I understood that you were referring to a dropout mask rather than a dropout layer.


Thank you…!