How to have deterministic dropout?

I want to apply dropout to input1 and input2 (two tensors of the same size), but I want the dropout to be identical for both of them, i.e. it should zero out (drop) the same elements in each. How can I do that?

I would sample a mask manually and apply it to both tensors (with the scaling, if needed).
Alternatively, you could try to seed the dropout call before each usage, but I would rather avoid that kind of seed hack.
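
For example, here is a minimal sketch of the manual-mask approach (assuming inverted dropout, i.e. scaling by 1/(1-p) during training so evaluation needs no rescaling; the shapes are just for illustration):

import torch

p = 0.1  # drop probability
input1 = torch.rand(5, 2, 2)
input2 = torch.rand(5, 2, 2)

# Sample a single keep-mask: each element is 1 with probability 1 - p.
mask = torch.bernoulli(torch.full_like(input1, 1 - p))

# Apply the same mask to both tensors, with the inverse scaling.
out1 = input1 * mask / (1 - p)
out2 = input2 * mask / (1 - p)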


I see, thank you. Would it be possible to see which elements the dropout will zero out if we know the size of the tensor?
I could check the output tensor to see where it is zero, but that is somewhat risky (if the tensor already contains zero elements, that would cause an issue, although it is not too difficult to check, I guess).
If I can get the mask from the dropout, the problem is solved.

I think your mentioned approach could work, but I would rather sample the mask manually than check the input and output for zeros.


Just to confirm: is this the right way of creating the mask?
Let's say I want to apply dropout with p=0.1:

import torch

p = 0.1
input1 = torch.rand(5, 2, 2)
# Each mask element is 1 with probability 1 - p (keep) and 0 with probability p (drop).
mask = torch.bernoulli(torch.full_like(input1, 1 - p))
input_after_dropout = mask * input1

torch.bernoulli expects an input tensor containing the probabilities of drawing a 1, so depending on how p is defined, your code should be correct.
Note that you would also have to take care of the scaling used in dropout layers (either the inverse scaling during training or the vanilla scaling during evaluation).
To switch the behavior between training and evaluation, you could create a custom nn.Module and use the internal self.training flag to switch between the two behaviors.
The self.training flag is toggled through calls to model.train() and model.eval().
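
As a minimal sketch of such a module (the name SharedDropout is made up for illustration; it uses the inverse scaling during training and is an identity during evaluation):

import torch
import torch.nn as nn

# Applies the same dropout mask to two same-sized tensors.
class SharedDropout(nn.Module):
    def __init__(self, p=0.1):
        super().__init__()
        self.p = p

    def forward(self, input1, input2):
        if self.training:
            # Sample one keep-mask and apply it to both inputs,
            # using the inverse scaling (inverted dropout).
            mask = torch.bernoulli(torch.full_like(input1, 1 - self.p))
            scale = 1.0 / (1.0 - self.p)
            return input1 * mask * scale, input2 * mask * scale
        # self.training is False after model.eval(): no-op.
        return input1, input2

Calling model.train() or model.eval() on a parent module sets self.training recursively on all submodules, so the stochastic masking is active only during training.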
