Extract BatchNorm1d layer information and calculate output manually

Dear All,
Out of curiosity, I extracted the layers of a network and calculated its output by hand for a new input.
A very simple regression model:

NeuralNet(
  (l0): Linear(in_features=6, out_features=256, bias=True)
  (relu): ReLU()
  (l00): Linear(in_features=256, out_features=1, bias=True)
)

My manual calculation:

    ReLU = lambda x: np.maximum(0.0, x)
    # GPU torch.Tensor to CPU numpy ndarray
    X_data = X_valid.cpu().numpy()
    # First layer
    W0 = model.l0.weight.cpu().detach().numpy()
    b0 = model.l0.bias.cpu().detach().numpy()

    # Final Layer
    W00 = model.l00.weight.cpu().detach().numpy()
    b00 = model.l00.bias.cpu().detach().numpy()

    # First output
    L0 = np.dot(W0, np.transpose(X_data)) + np.tile(np.reshape(b0, (-1, 1)), X_data.shape[0])
    L0 = np.array(list(map(ReLU, L0)))

    # Final output
    L00 = np.dot(W00, L0) + np.tile(np.reshape(b00, (-1, 1)), X_data.shape[0])
    L00 = np.array(list(map(ReLU, L00)))

It works: I get the same results as model(X_valid).
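
As an aside, the same forward pass can be written more compactly by keeping the batch on the first axis (as PyTorch does) and letting NumPy broadcast the bias. This is only a sketch: random arrays stand in for the extracted weights, the shapes follow the model printout, the name `forward_numpy` is made up for illustration, and the final ReLU is kept only because the manual calculation above applies one.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def forward_numpy(X, W0, b0, W00, b00):
    """Linear -> ReLU -> Linear -> ReLU with the batch on axis 0."""
    h = relu(X @ W0.T + b0)       # (N, 6) @ (6, 256) -> (N, 256); b0 broadcasts over rows
    return relu(h @ W00.T + b00)  # (N, 256) @ (256, 1) -> (N, 1)

# Random stand-ins for the parameters extracted from the model (assumed shapes)
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 6))
W0, b0 = rng.normal(size=(256, 6)), rng.normal(size=256)
W00, b00 = rng.normal(size=(1, 256)), rng.normal(size=1)

out = forward_numpy(X, W0, b0, W00, b00)
print(out.shape)  # (5, 1)
```

The result is the transpose of L00 from the script above, since that script keeps features on the first axis.
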
However, when I added a BatchNorm1d layer,

NeuralNet(
  (l0): Linear(in_features=6, out_features=256, bias=True)
  (bn0): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU()
  (l00): Linear(in_features=256, out_features=1, bias=True)
)

and used the following script to calculate the output, it did not work. I ended up with very different results from model(X_valid).

    ReLU = lambda x: np.maximum(0.0, x)
    # GPU torch.Tensor to CPU numpy ndarray
    X_data = X_valid.cpu().numpy()
    # First layer
    W0 = model.l0.weight.cpu().detach().numpy()
    b0 = model.l0.bias.cpu().detach().numpy()

    # Final Layer
    W00 = model.l00.weight.cpu().detach().numpy()
    b00 = model.l00.bias.cpu().detach().numpy()

    # First output
    L0 = np.dot(W0, np.transpose(X_data)) + np.tile(np.reshape(b0, (-1, 1)), X_data.shape[0])
    L0 = np.array(list(map(ReLU, L0)))
    
    # Batch Normalization Layer
    bn_mean=model.bn0.running_mean.cpu().numpy()
    bn_var=model.bn0.running_var.cpu().numpy()
    bn_gamma=model.bn0.weight.cpu().detach().numpy()
    bn_beta=model.bn0.bias.cpu().detach().numpy() # previously a typo here!
    bn_epsilon = model.bn0.eps
    # Reshape for Matrix calculation
    bn_mean = np.tile(np.reshape(bn_mean, (-1, 1)), L0.shape[1])
    bn_var = np.tile(np.reshape(bn_var, (-1, 1)), L0.shape[1])
    bn_gamma = np.tile(np.reshape(bn_gamma, (-1, 1)), L0.shape[1])
    bn_beta = np.tile(np.reshape(bn_beta, (-1, 1)), L0.shape[1])
    
    L0 = np.multiply(np.divide(L0-bn_mean,np.sqrt(bn_var+bn_epsilon)), bn_gamma)+bn_beta

    # Final output
    L00 = np.dot(W00, L0) + np.tile(np.reshape(b00, (-1, 1)), X_data.shape[0])
    L00 = np.array(list(map(ReLU, L00)))

What’s wrong with my BatchNorm1d layer calculation? I used the formula I found on the PyTorch website.
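
For reference, the documented formula in eval mode is y = (x - running_mean) / sqrt(running_var + eps) * weight + bias, applied per feature. Below is a minimal NumPy sketch of it with made-up statistics. The helper name `batchnorm_eval` is hypothetical, and the (features, batch) layout is chosen only to match the script above. One thing worth checking separately: the module printout lists layers in definition order, which need not be the order in which forward() calls them, so the manual calculation has to apply ReLU and batch norm in whatever order forward() actually uses.

```python
import numpy as np

def batchnorm_eval(x, mean, var, gamma, beta, eps=1e-5):
    """Eval-mode batch norm for x of shape (features, batch)."""
    # Turn the per-feature vectors into columns so they broadcast across the batch
    mean, var, gamma, beta = (v.reshape(-1, 1) for v in (mean, var, gamma, beta))
    return (x - mean) / np.sqrt(var + eps) * gamma + beta

# Made-up running statistics for 3 features and a batch of 2
x = np.array([[1.0, 3.0], [2.0, 2.0], [0.0, 4.0]])
mean = np.array([2.0, 2.0, 2.0])
var = np.array([1.0, 4.0, 1.0])
gamma = np.ones(3)   # stands in for bn0.weight
beta = np.zeros(3)   # stands in for bn0.bias

# eps=0.0 only to keep the numbers round; the layer in the post uses 1e-05
y = batchnorm_eval(x, mean, var, gamma, beta, eps=0.0)
print(y)
```

Broadcasting against column vectors replaces the four np.tile calls and gives the same values.
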

In training mode nn.BatchNorm layers will use the batch statistics and update the running estimates.
If you call .eval() on this layer or your complete model, the running estimates will be used.
Are you comparing a training run of your model with a manual eval implementation?
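
To illustrate the difference, here is a rough NumPy sketch of the two modes. It is not PyTorch's actual implementation: the class name `TinyBatchNorm` is made up, and the affine weight and bias are left out (as if weight=1 and bias=0). In training mode it normalizes with the biased variance of the current batch and moves the running estimates toward the batch statistics by `momentum`, using the unbiased variance for running_var as PyTorch does. In eval mode it uses only the running estimates.

```python
import numpy as np

class TinyBatchNorm:
    """Illustrative batch norm over inputs of shape (batch, features)."""

    def __init__(self, num_features, eps=1e-5, momentum=0.1):
        self.eps, self.momentum = eps, momentum
        self.running_mean = np.zeros(num_features)
        self.running_var = np.ones(num_features)
        self.training = True

    def __call__(self, x):
        if self.training:
            mean = x.mean(axis=0)
            var = x.var(axis=0)  # biased variance, used for normalization
            m = self.momentum
            # Move the running estimates toward the current batch statistics
            self.running_mean = (1 - m) * self.running_mean + m * mean
            self.running_var = (1 - m) * self.running_var + m * x.var(axis=0, ddof=1)
        else:
            mean, var = self.running_mean, self.running_var
        return (x - mean) / np.sqrt(var + self.eps)

bn = TinyBatchNorm(2)
x = np.array([[0.0, 10.0], [2.0, 14.0]])

y_train = bn(x)      # uses batch statistics and updates the running estimates
bn.training = False  # plays the role of calling .eval()
y_eval = bn(x)       # uses the running estimates only

print(np.allclose(y_train, y_eval))  # False
```

The same input gives different outputs in the two modes, which is why a manual eval-style calculation cannot match a model that is still in training mode.
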

Edit: solved!

Hi,

I did call model.eval() before both the manual calculation and the model prediction.

When we call model.eval() and predict, which BatchNorm layer statistics are used? Are running_mean and running_var used at prediction time?