Suppose we have 2 minibatches (each with 10 data points). When dropout is turned on for the forward pass of the first minibatch, a dropout mask of dimension 10 is generated. What if we want to use the same mask for the second batch of data?
You can do this dropout operation yourself instead of using the built-in dropout layer: generate a Bernoulli mask with
torch.bernoulli and then multiply both mini-batches by the same mask.
# generate a mask of same shape as input1
mask = Variable(torch.bernoulli(input1.data.new(input1.data.size()).fill_(0.5)))
output1 = input1 * mask
output2 = input2 * mask
Is it correct to rescale the mask so the output has the same magnitude, in the following way?
mask = Variable(torch.bernoulli(input1.data.new(input1.data.size()).fill_(0.4)))/0.6
Looks good to me…
For future readers, I would like to mention that the rescaling above is not correct.
Please note that the Bernoulli distribution samples 0 with the probability (1-p), contrary to dropout implementations, which sample 0 with probability p.
Therefore, if you want dropout with p=0.4, the mask has to be (with Bernoulli from torch.distributions):
mask = Bernoulli(torch.full_like(input1, 0.6)).sample()/0.6
For dropout with p=0.6 the mask is
mask = Bernoulli(torch.full_like(input1, 0.4)).sample()/0.4
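As a quick sanity check of this rescaling, here is a minimal sketch (assuming torch.distributions.Bernoulli, whose .sample() draws a 0/1 tensor from the distribution):

```python
import torch

torch.manual_seed(0)

p = 0.4            # dropout probability (fraction of units zeroed)
keep_prob = 1 - p  # Bernoulli must sample 1 with probability 1 - p

x = torch.ones(100_000)

# Inverted dropout: zero units with probability p, scale survivors by 1/(1 - p).
# .sample() draws a tensor of 0s and 1s from the Bernoulli distribution.
mask = torch.distributions.Bernoulli(torch.full_like(x, keep_prob)).sample() / keep_prob
y = x * mask

# The mean of y stays close to 1.0, so the expected activation is preserved.
print(y.mean().item())
```

Because the surviving units are scaled by 1/(1 - p), the expectation of the masked tensor matches the original, which is the point of the rescaling.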
What is .sample() in your code?
If I understand correctly, this way I generate a different mask (i.e. dropout pattern) for each element in the batch, since
input1.data.size() also includes the batch dimension; shouldn't I rather apply the same mask to each element in the batch?
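To reuse one dropout pattern for every sample, you can draw the mask over the feature dimensions only and let broadcasting apply it across the batch. A minimal sketch (the 10×5 shapes are made up for illustration):

```python
import torch

torch.manual_seed(0)

batch1 = torch.randn(10, 5)  # two mini-batches: 10 samples, 5 features each
batch2 = torch.randn(10, 5)

p = 0.4  # dropout probability

# Draw the mask over the feature dimension only (shape (5,), not (10, 5)),
# so every sample in both batches shares the same dropout pattern.
keep = torch.distributions.Bernoulli(torch.full((5,), 1 - p)).sample() / (1 - p)

out1 = batch1 * keep  # the (5,) mask broadcasts over the batch dimension
out2 = batch2 * keep  # identical mask reused for the second mini-batch
```

Sampling with input1.data.size() as in the earlier snippet gives an independent pattern per sample, which is what standard dropout layers do; whether a shared pattern is preferable depends on what you want to keep fixed between batches.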