[solved] Dropout erroneously multiplies by 2?

Python 2.7.13 |Anaconda 4.4.0 (x86_64)| (default, Dec 20 2016, 23:05:08) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import torch
>>> torch.__version__
'0.1.12_2'
>>> import torch.nn as nn
>>> from torch.autograd import Variable
>>> v = Variable(torch.randn(2,3))
>>> v
Variable containing:
 0.6709  1.5865  2.2447
-0.1978 -2.0900 -0.8279
[torch.FloatTensor of size 2x3]

>>> dp = nn.Dropout(0.5)
>>> dp
Dropout (p = 0.5)
>>> v_dp = dp(v)
>>> v_dp
Variable containing:
 1.3418  3.1730  0.0000
-0.0000 -4.1799 -1.6558
[torch.FloatTensor of size 2x3]

>>> v
Variable containing:
 0.6709  1.5865  2.2447
-0.1978 -2.0900 -0.8279
[torch.FloatTensor of size 2x3]

Hi

Please note the behaviour of dropout shown above: I'm not sure why it multiplies the surviving entries by 2.

EDIT: solved
I'm now checking the model's training mode and scaling the values back accordingly. I'm still not sure why this scaling was added, though; the docs say it's there so that the module reduces to an identity at evaluation time.

if self.training:
    # undo dropout's 1/(1 - p) training-time scaling (here p = config.word_drop)
    drop_mask = drop_mask * (1 - config.word_drop)
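For reference, a minimal sketch (assuming the same setup as the session above) of how the scaling can be avoided without manual rescaling, by switching the module between training and evaluation mode:

import torch
import torch.nn as nn
from torch.autograd import Variable

dp = nn.Dropout(0.5)
v = Variable(torch.randn(2, 3))

dp.train()        # training mode: zeroes ~half the entries, scales survivors by 1/(1 - 0.5) = 2
print(dp(v))

dp.eval()         # evaluation mode: dropout is an identity, no zeroing and no scaling
print(dp(v))      # prints v unchanged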

Scaling the surviving values by (1 - drop_prob)^(-1) (here 1/(1 - 0.5) = 2) is the standard "inverted dropout" formulation: it keeps the expected activation unchanged, so no rescaling is needed at test time. You can read more about it at www.deeplearningbook.org
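A rough sketch of why the expectation is preserved (averaging over many elements to approximate it):

import torch
import torch.nn as nn
from torch.autograd import Variable

dp = nn.Dropout(0.5)
x = Variable(torch.ones(1000))

# Each element is zeroed with probability 0.5 and multiplied by 1/(1 - 0.5) = 2 otherwise,
# so the expected value of every output element equals the input value.
out = dp(x)
print(out.mean())   # approximately 1.0, matching the input mean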