Have a look at this post for an example of why we are scaling the activations.
Note that the p in my explanation refers to the keep probability, not the drop probability.
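
In case it helps, here is a minimal sketch of the scaling in plain Python. It is an illustration of the idea, not the actual PyTorch implementation. The function name `inverted_dropout` and the variable `keep_prob` are made up for this example, with `keep_prob` playing the role of p (the keep probability) from my explanation. Note that `torch.nn.Dropout(p=...)` uses the opposite convention, where `p` is the drop probability.

```python
import random


def inverted_dropout(activations, keep_prob, rng):
    """Keep each activation with probability keep_prob.

    Survivors are scaled by 1 / keep_prob so that the expected value of
    every activation stays the same as without dropout.
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the run is reproducible
activations = [1.0] * 100_000
keep_prob = 0.8  # keep probability; the drop probability is 1 - keep_prob

dropped = inverted_dropout(activations, keep_prob, rng)

# Mean of the activations after dropout with the 1 / keep_prob scaling.
mean_scaled = sum(dropped) / len(dropped)
# Undoing the scaling shows what the mean would be without it.
mean_unscaled = mean_scaled * keep_prob

print(f"mean with scaling:    {mean_scaled:.3f}")
print(f"mean without scaling: {mean_unscaled:.3f}")
```

With the scaling, the mean stays close to the original value of 1.0. Without it, the mean shrinks to roughly `keep_prob`, so the next layer would see smaller inputs during training than during evaluation, when nothing is dropped.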