How to keep the same randomness in dropout?

klion · August 26, 2022, 2:36am

I expected the following two scripts to produce the same model, but because of the randomness in dropout they do not.

A simplified version of the code is as follows:

# script 1
for t in range(100):
    vec = torch.randn([1,500])
    s = compute_score(method="std", vec=vec)
    
    some_functions(s)

    TrainNetwork(model)

# script 2
for t in range(100):
    vec = torch.randn([1,500])
    s = compute_score(method="NoE", vec=vec)
    
    some_functions(s)

    TrainNetwork(model)

In the function compute_score I set nu=0, so the two different methods, “NoE” and “std”, should give the same result.

import torch
from torch import nn

class NetworkFC2(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.fc1 = nn.Linear(500, 100)
        self.fc2 = nn.Linear(100, 1)
        self.dropout = nn.Dropout(p = 0.2)
    
    def forward(self, vec):
        """
        vec.shape = (1, 500)
        """
        o1 = self.dropout(self.fc1(vec))
        return self.fc2(o1)

torch.manual_seed(123)

model = NetworkFC2()

def compute_score(method, vec):
    if method == "std":
        model.train()
        score_v = torch.zeros(20)
        for i in range(20):
            score_v[i] = model(vec)
        std = score_v.std()

        model.eval()
        s0 = model(vec)

        nu = 0
        score = s0 + nu * std
    
    elif method == "NoE":
        model.eval()
        score = model(vec)

    return score

def TrainNetwork(model):
    pass

def some_functions(score):
    pass

However, the “std” method performs 20 extra dropout passes that the “NoE” method does not, so the random number generator state diverges between the two scripts and I can’t get the same model.

Is there a way to solve this problem?

Thanks in advance for any help/suggestion.

ptrblck · August 26, 2022, 5:01am

You are right that the different number of calls into the pseudorandom number generator is causing the difference in the outputs.
One approach would be to sample the dropout masks manually and select the ones for the corresponding iteration. Another would be to explicitly seed the code before the dropout execution with a specific seed.
However, since your use case forces you to use a different number of iterations, it would be interesting to learn more about the actual expectation.
Is one method creating a single output sample in the loop while the other one uses a batched approach?
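
For reference, the idea of controlling the RNG around the dropout calls could be sketched roughly as below. This is only a sketch under assumptions, not the exact approach described above: instead of re-seeding, it wraps the extra training-mode passes in `torch.random.fork_rng`, which restores the RNG state on exit. Passing `model` as an argument, the `n_samples` parameter, and the helper `run` are made up for illustration, and `devices=[]` limits the fork to the CPU generator.

```python
import torch
from torch import nn


class NetworkFC2(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.fc1 = nn.Linear(500, 100)
        self.fc2 = nn.Linear(100, 1)
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, vec):
        return self.fc2(self.dropout(self.fc1(vec)))


def compute_score(model, vec, method, n_samples=20, nu=0.0):
    if method == "std":
        # fork_rng saves the CPU RNG state on entry and restores it on exit,
        # so the dropout passes inside do not shift later random draws.
        with torch.random.fork_rng(devices=[]):
            model.train()
            with torch.no_grad():
                samples = torch.stack(
                    [model(vec).squeeze() for _ in range(n_samples)]
                )
            std = samples.std()
        model.eval()
        return model(vec) + nu * std

    # "NoE": a single deterministic pass in eval mode
    model.eval()
    return model(vec)


def run(method, steps=3):
    # Hypothetical driver that mimics the loop in the two scripts.
    torch.manual_seed(123)
    model = NetworkFC2()
    scores = []
    for _ in range(steps):
        vec = torch.randn(1, 500)
        scores.append(compute_score(model, vec, method).item())
    return scores
```

With `nu = 0`, both methods should then leave the global generator in the same state after each iteration, so `run("std")` and `run("NoE")` see the same `vec` draws. On a GPU the relevant device IDs would presumably have to be passed through `devices`.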

klion · August 26, 2022, 5:41am

Thanks for your answer. In fact, I am doing research on uncertainty in networks and would like to use the dropout outputs to study it.
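
For this kind of uncertainty study, the suggestion of sampling the dropout masks manually might look something like the sketch below. It is an illustration under assumptions rather than a tested recipe: `ManualDropout` and its `seed` argument are hypothetical names, the masks come from a private CPU `torch.Generator`, and the module would replace `nn.Dropout` in the network. Because the global generator is never touched, the number of dropout passes should no longer affect `torch.randn` or other random calls.

```python
import torch
from torch import nn


class ManualDropout(nn.Module):
    """Inverted dropout whose masks are drawn from a private generator."""

    def __init__(self, p: float = 0.2, seed: int = 0) -> None:
        super().__init__()
        self.p = p
        # Private CPU generator (assumption: inputs live on the CPU).
        self.gen = torch.Generator()
        self.gen.manual_seed(seed)

    def forward(self, x):
        if not self.training or self.p == 0.0:
            return x
        keep = 1.0 - self.p
        # Keep each element with probability `keep`, using only self.gen.
        mask = (torch.rand(x.shape, generator=self.gen) < keep).to(x.dtype)
        # Rescale so the expected activation matches eval mode.
        return x * mask / keep
```

Two instances created with the same `seed` produce the same sequence of masks, which makes the sampled outputs repeatable across scripts regardless of how many other random calls happen in between.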