What does model.eval() do for batchnorm layer?

Thanks a lot, but I don’t want to train on the test set. I just want BatchNorm to use the test batch’s own mean and std instead of the running mean and std from training — in other words, not to apply model.eval(). I mean that beta and gamma stay fixed (as learned during training), while the normalization statistics are computed on the test data.
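A minimal sketch of one way to get this behavior (my own example, not from the thread): call `model.eval()` as usual, then switch only the BatchNorm layers back to train mode, so each test batch is normalized with its own mean/variance while gamma and beta stay fixed. Note that in train mode BN still updates its running buffers on each forward pass, even under `torch.no_grad()`; if you want to avoid that, construct the layers with `track_running_stats=False` instead.

```python
import torch
import torch.nn as nn

# Hypothetical small model for illustration.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

model.eval()  # everything in eval mode (BN would use running stats)

# Flip only the BatchNorm layers back to train mode so they normalize
# each batch with that batch's own statistics; gamma/beta are unchanged.
for m in model.modules():
    if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
        m.train()

with torch.no_grad():
    out = model(torch.randn(4, 1, 8, 8))
print(out.shape)  # torch.Size([4, 8, 8, 8])
```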

is that what you mean?

Hi, smth

I want to know how PyTorch uses the running_mean and running_std during evaluation. Is it

x = (x - running_mean) / running_std

or

std = m / (m - 1) * running_std # where m is batch size
x = (x - running_mean) / std
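For what it's worth, a small check (my own sketch, not from the thread): PyTorch stores a running *variance*, not a running std, and in eval mode it normalizes as `(x - running_mean) / sqrt(running_var + eps)`. The unbiased `m / (m - 1)` correction is applied when the running variance is *updated* during training, not at evaluation time. This can be verified by reproducing the eval-mode output by hand:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(3)

# A few training-mode batches so the running statistics move off (0, 1).
bn.train()
for _ in range(5):
    bn(torch.randn(8, 3, 4, 4))

bn.eval()
x = torch.randn(8, 3, 4, 4)
out = bn(x)

# Manual reproduction of the eval-mode computation.
mean = bn.running_mean.view(1, -1, 1, 1)
var = bn.running_var.view(1, -1, 1, 1)
weight = bn.weight.view(1, -1, 1, 1)
bias = bn.bias.view(1, -1, 1, 1)
manual = (x - mean) / torch.sqrt(var + bn.eps) * weight + bias

print(torch.allclose(out, manual, atol=1e-6))  # True
```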

I think you should put batch norm before the ReLU activations.

I have the same problem when I use model.eval() at test time. I used the BN layer like this:

def conv_block(self, in_channels, out_channels, kernel_size=3, stride=2, padding=1):
    block = nn.Sequential(
        nn.Conv2d(in_channels=in_channels,
                  out_channels=out_channels,
                  kernel_size=kernel_size,
                  stride=stride,
                  padding=padding),
        nn.BatchNorm2d(num_features=out_channels),
        nn.ReLU()
    )
    return block

and then used it as:

self.block1 = self.conv_block(in_channels=1,
                              out_channels=32,
                              kernel_size=7,
                              stride=2,
                              padding=3)
self.block2 = self.conv_block(in_channels=32,  # must match block1's out_channels
                              out_channels=32,
                              kernel_size=7,
                              stride=2,
                              padding=3)

Is this a problem? I don’t think so. When I don’t use model.eval() the results are good, but when I do use it, performance drops drastically.

My batch size is 64, and I test the model on the same training data.
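A common cause of this symptom (a hedged suggestion, not a confirmed diagnosis of your model) is that the running statistics haven't converged: with the default `momentum=0.1` they are an exponential average dominated by the most recent batches. One remedy is to reset the running stats and re-estimate them with cumulative averaging (`momentum=None`) over a pass of training data, then evaluate. A minimal sketch:

```python
import torch
import torch.nn as nn

def recalibrate_batchnorm(model, batches):
    # Reset BN buffers and switch to cumulative (equal-weight) averaging.
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.reset_running_stats()
            m.momentum = None  # momentum=None means cumulative moving average
    model.train()
    with torch.no_grad():
        for x in batches:  # forward passes only, to update running stats
            model(x)
    model.eval()

# Hypothetical tiny model and data, for illustration only.
model = nn.Sequential(nn.Conv2d(1, 4, kernel_size=3, padding=1),
                      nn.BatchNorm2d(4))
batches = [torch.randn(64, 1, 8, 8) for _ in range(10)]
recalibrate_batchnorm(model, batches)

bn = model[1]
print(int(bn.num_batches_tracked))  # 10
```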