How is dropout implemented?


I am wondering where exactly dropout is applied. For fc layers of y = Wx + b, does dropout randomly drop the parameters in the W matrix rather than the feature x, and for conv layers, does it randomly drop the parameters in the convolution kernels rather than the feature maps? Is this what actually happens in dropout layers?

nn.Dropout will randomly zero out some elements of the input activation, not the weights.
The same applies to 4-dimensional tensors (e.g. conv outputs) passed through nn.Dropout: individual elements are zeroed.
If you want to zero out complete channels instead, you should use nn.Dropout2d (docs).
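A quick sketch illustrating the difference (assuming PyTorch is installed). With p=0.5, surviving elements are scaled by 1/(1-p) = 2 during training, so the output contains only 0s and 2s here:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# nn.Dropout zeroes individual elements of the input activation,
# not the weight matrix; survivors are scaled by 1/(1-p).
drop = nn.Dropout(p=0.5)
x = torch.ones(2, 4)
print(drop(x))  # some entries are 0, the rest are 2.0

# nn.Dropout2d zeroes entire channels of a 4-D (N, C, H, W) tensor.
drop2d = nn.Dropout2d(p=0.5)
fmap = torch.ones(1, 3, 2, 2)
out = drop2d(fmap)
print(out)  # each channel is either all 0 or all 2.0
```

Note that both layers only behave this way in training mode; after calling `.eval()` they become identity functions.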
