I’m trying to convert an old TensorFlow 1.x model to PyTorch. I have written a script that exports the TF model weights and loads them into the PyTorch model. However, I still have problems with batch normalization. This is my old source code:
def cnn_block(inputs, filters, kernel_size):
    cnn = tf.layers.conv2d(
        inputs=inputs,
        filters=filters,
        kernel_size=kernel_size,
        padding="same",
        activation=None,
        use_bias=False
    )
    output = tf.layers.batch_normalization(
        inputs=cnn,
        momentum=0.9,
        epsilon=1e-5,
        center=True,
        scale=True,
    )
    return output
I have implemented the following PyTorch code:
import torch
from torch import nn


class CNNBlock(nn.Module):
    def __init__(self, in_channels: int, out_channels: int, kernel_size: int):
        super().__init__()
        self.model = nn.Sequential(
            nn.Conv2d(
                in_channels=in_channels,
                out_channels=out_channels,
                kernel_size=kernel_size,
                padding=kernel_size // 2,
                bias=False
            ),
            nn.BatchNorm2d(
                num_features=out_channels,
                eps=1e-5,
                momentum=0.9,
                affine=True
            )
        )

    def forward(self, tensor: torch.Tensor) -> torch.Tensor:
        return self.model(tensor)
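For reference, this is how I run the block when comparing against the TF output. Note that I call eval() so that BatchNorm2d uses the loaded running statistics, which should match TF's default training=False; the channel counts and input shape here are arbitrary placeholders, and the Sequential is a standalone equivalent of the class above:

```python
import torch
from torch import nn

# Standalone equivalent of CNNBlock above (Conv2d + BatchNorm2d, no bias).
block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm2d(16, eps=1e-5, momentum=0.9, affine=True),
)
block.eval()  # use running_mean/running_var instead of batch statistics
with torch.no_grad():
    out = block(torch.randn(1, 3, 8, 8))
print(out.shape)  # torch.Size([1, 16, 8, 8])
```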
However, the nn.BatchNorm2d() layer returns different results than the tf.layers.batch_normalization() layer.
In the weight-loading routine, I copy the gamma value to model.weight, the beta value to model.bias, the moving_mean value to model.running_mean, and the moving_variance value to model.running_var. However, the results are quite different.
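A minimal sketch of that copy routine, in case the mapping itself is the problem. The variable names ("gamma", "beta", "moving_mean", "moving_variance") are the tf.layers.batch_normalization defaults, and tf_weights is an assumed dict mapping those names to NumPy arrays exported from the TF checkpoint:

```python
import numpy as np
import torch
from torch import nn

def load_bn_weights(bn: nn.BatchNorm2d, tf_weights: dict) -> None:
    """Copy exported TF batch-norm variables into a PyTorch BatchNorm2d."""
    with torch.no_grad():
        bn.weight.copy_(torch.from_numpy(tf_weights["gamma"]))
        bn.bias.copy_(torch.from_numpy(tf_weights["beta"]))
        bn.running_mean.copy_(torch.from_numpy(tf_weights["moving_mean"]))
        bn.running_var.copy_(torch.from_numpy(tf_weights["moving_variance"]))
```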
If I comment out the batch normalization in both source files, Conv2d() returns essentially the same values.
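This is roughly how I compare the two outputs. The helper name and the tolerance are my own choices; the only non-obvious step is the layout transpose, since tf.layers.conv2d produces NHWC tensors while PyTorch produces NCHW:

```python
import numpy as np
import torch

def outputs_match(torch_out: torch.Tensor, tf_out: np.ndarray, atol: float = 1e-5) -> bool:
    """Compare a PyTorch NCHW output with a TF NHWC output within a tolerance."""
    tf_nchw = tf_out.transpose(0, 3, 1, 2)  # NHWC -> NCHW
    return np.allclose(torch_out.detach().cpu().numpy(), tf_nchw, atol=atol)
```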