Have a look at this post for an example why we are scaling the activations.
Note that the p in my explanation refert to the keep probability not the drop probability.
Have a look at this post for an example why we are scaling the activations.
Note that the p in my explanation refert to the keep probability not the drop probability.