I am studying the dropout algorithm and implementing it myself in Python or Julia.
Suppose I have two hidden layers with dropout ratios of 0.5 and 0.3, respectively.
By what factor should I multiply the output at evaluation time?
Also, should (or may) I multiply the output by some factor during training, such as 1/(1-p) for an appropriate p?
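
To make the question concrete, here is a minimal sketch of how I currently understand the training-time scaling (usually called inverted dropout). It assumes p is the probability of dropping a unit, not of keeping it, and the names `dropout` and `rng` are just illustrative. If this understanding is right, each layer is scaled by 1/(1-p) during training and nothing extra is multiplied at evaluation, but I am not sure it is correct.

```python
import random


def dropout(values, p, training, rng=random):
    """Inverted dropout on a list of floats.

    Assumption: p is the DROP probability, so a unit is kept with
    probability 1 - p.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        # Evaluation: pass the values through unchanged, no extra factor.
        return list(values)
    keep = 1.0 - p
    # Training: zero a unit with probability p, scale survivors by 1/(1-p)
    # so the expected value of each unit stays the same.
    return [v / keep if rng.random() < keep else 0.0 for v in values]


if __name__ == "__main__":
    rng = random.Random(0)
    x = [1.0] * 100_000
    # Two dropout layers with ratios 0.5 and 0.3; each uses its own p.
    h = dropout(dropout(x, 0.5, True, rng), 0.3, True, rng)
    print("training mean:", sum(h) / len(h))
    print("evaluation output:", dropout(x[:3], 0.5, False))
```

In this sketch the average activation during training stays close to the input average, which is why I suspect no factor is needed at evaluation for either layer.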
- In PyTorch, I have never had to multiply the output by any factor at evaluation time when using dropout. Does PyTorch apply this scaling automatically?
Thank you in advance.