Results become worse after adding dropout

Hi, before adding dropout, my model worked well on both the training and test sets. After adding dropout, even with p=0.1, the results on the test set became much worse (the training set seems unaffected). I suspect this is an implementation issue rather than overfitting. I made sure to call model.eval() during validation/testing. I use dropout in my model like this:

    import torch.nn as nn

    # Class wrapper and __init__ signature reconstructed for completeness;
    # only the body was posted.
    class ConvBlock(nn.Module):
        def __init__(self, in_channels, out_channels, kernel_size, dilation,
                     do_act=True, if_drop=False, drop_rate=0.1):
            super().__init__()
            self.conv = nn.Conv3d(in_channels, out_channels,
                                  kernel_size=kernel_size, dilation=dilation)
            self.bn = nn.BatchNorm3d(out_channels)
            self.do_act = do_act
            self.if_drop = if_drop
            if self.do_act:
                self.act = nn.PReLU()
            if self.if_drop:
                self.dropout = nn.Dropout3d(drop_rate)

        def forward(self, input):
            out = self.bn(self.conv(input))
            if self.do_act:
                out = self.act(out)
            if self.if_drop:
                out = self.dropout(out)
            return out
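
As a quick sanity check that .eval() really turns dropout off, here is a minimal standalone snippet:

    import torch
    import torch.nn as nn

    drop = nn.Dropout3d(p=0.1)
    x = torch.ones(1, 4, 2, 2, 2)  # (N, C, D, H, W)

    drop.train()
    # Training mode: whole channels are zeroed at random and the survivors
    # are scaled by 1/(1 - p) ~= 1.111, so values are 0.0 or ~1.111.
    print(drop(x).unique())

    drop.eval()
    # Eval mode: dropout is the identity, the input passes through unchanged.
    print(drop(x).unique())   # tensor([1.])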

Thanks!

May not be a bug. BatchNorm generally doesn't play nicely with dropout: during training, dropout scales the surviving activations by 1/(1 - p), while at test time it is the identity, so the activation statistics that downstream BatchNorm layers see at test time no longer match the running statistics they accumulated during training. Maybe do an ablation study:

  1. no batchnorm, no dropout
  2. just batchnorm, no dropout
  3. just dropout, no batchnorm
  4. both batchnorm and dropout

and see whether that gives further insight into whether there's a bug. If there is a bug in the dropout path, I'd expect 3) to be much worse than 1), and so on. A sketch of how to build the four variants follows below.
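
A minimal sketch of how the four variants could be wired up (make_block, the kernel size, and the channel counts here are illustrative, not taken from your model):

    import torch.nn as nn

    def make_block(use_bn, use_drop, in_ch=8, out_ch=8, p=0.1):
        # conv -> (BN) -> PReLU -> (Dropout3d), matching the posted block.
        # No conv bias when BN follows (see the side note below).
        layers = [nn.Conv3d(in_ch, out_ch, kernel_size=3, bias=not use_bn)]
        if use_bn:
            layers.append(nn.BatchNorm3d(out_ch))
        layers.append(nn.PReLU())
        if use_drop:
            layers.append(nn.Dropout3d(p))
        return nn.Sequential(*layers)

    variants = {
        "1_plain":     make_block(use_bn=False, use_drop=False),
        "2_bn_only":   make_block(use_bn=True,  use_drop=False),
        "3_drop_only": make_block(use_bn=False, use_drop=True),
        "4_bn_drop":   make_block(use_bn=True,  use_drop=True),
    }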

As a side note, could you set bias=False on conv layers that are immediately followed by BatchNorm? Since BatchNorm subtracts the per-channel mean, any bias added by the conv is cancelled out; it is a redundant parameter, and BatchNorm's own learnable shift (beta) plays the same role.
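
In the block you posted, that would be:

    self.conv = nn.Conv3d(in_channels, out_channels, kernel_size=kernel_size,
                          dilation=dilation, bias=False)  # BN's beta takes over the shift
    self.bn = nn.BatchNorm3d(out_channels)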