But one thing to consider is whether alpha is that descriptive a name for the standard deviation and whether it is a good parameter convention.
PyTorch’s standard dropout with Bernoulli takes the rate p. The multiplier will have mean 1 and standard deviation (p * (1-p))**0.5 / (1-p) = (p/(1-p))**0.5 (the numerator (p*(1-p))**0.5 is the standard deviation of the Bernoulli variable and the denominator 1-p comes from the 1/(1-p) scaling).
So if you want to more closely match what (Bernoulli) Dropout does in terms of mean and std, you could take an argument p and use the standard deviation (p/(1-p))**0.5 instead of self.alpha.
(I think e.g. Keras’ Gaussian dropout does that, too.)
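To double-check that algebra, here is a quick numeric sanity check (plain Python, no torch needed): the inverted-dropout multiplier is 0 with probability p and 1/(1-p) with probability 1-p, so its mean and std can be computed exactly:

```python
# Sanity check: the inverted-dropout multiplier has mean 1 and std (p/(1-p))**0.5.
for p in (0.1, 0.25, 0.5, 0.9):
    keep = 1.0 - p
    scale = 1.0 / keep                # inverted-dropout scaling factor
    mean = keep * scale               # E[m] = (1-p)/(1-p) = 1
    var = keep * scale**2 - mean**2   # E[m^2] - E[m]^2 = p/(1-p)
    assert abs(mean - 1.0) < 1e-12
    assert abs(var**0.5 - (p / (1.0 - p))**0.5) < 1e-9
print("mean 1 and std (p/(1-p))**0.5 confirmed")
```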
Almost!
I’d do the check (raising a RuntimeError rather than using assert — assert should be reserved for “internal” dev errors) in the init, and I’d probably just use torch.randn_like(…) * stddev. Other than that it looks good to me at first sight.
Thanks! Will leave the final implementation here in case it’s useful for someone else.
class GaussianDropout(nn.Module):
    def __init__(self, p=0.5):
        super().__init__()
        if p <= 0 or p >= 1:
            raise ValueError("p must satisfy 0 < p < 1")
        self.p = p

    def forward(self, x):
        if self.training:
            stddev = (self.p / (1.0 - self.p)) ** 0.5
            # multiplicative noise with mean 1 (not 0), matching the
            # mean/std of scaled Bernoulli dropout discussed above
            epsilon = 1.0 + torch.randn_like(x) * stddev
            return x * epsilon
        else:
            return x
P.P.S.: For v2 of the code, I’d probably allow p=0. It can be handy to use a hyperparameter to disable dropout for experimentation (even if not having the module at all would be more efficient, changing the model structure can lead to hiccups, e.g. with nn.Sequential when you want to compare parameters etc.).
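A sketch of what that v2 could look like (treating p=0 as an identity in both train and eval mode is my reading of the intent here):

```python
import torch
import torch.nn as nn

class GaussianDropout(nn.Module):
    """Multiplicative Gaussian noise with std (p/(1-p))**0.5; p=0 disables it."""
    def __init__(self, p=0.5):
        super().__init__()
        if p < 0 or p >= 1:
            raise ValueError("p must satisfy 0 <= p < 1")
        self.p = p

    def forward(self, x):
        if self.training and self.p > 0:
            stddev = (self.p / (1.0 - self.p)) ** 0.5
            # mean-1 multiplicative noise, as in the implementation above
            return x * (1.0 + torch.randn_like(x) * stddev)
        return x  # identity in eval mode or when p == 0
```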