Hi guys

I am currently trying to rebuild a network from TensorFlow in PyTorch. The library onnx2pytorch worked for me, but I would like to build a replica class of the same network in PyTorch. The idea is to use this class to test inference with different number systems such as bfloat16 or posit.
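
For context, the kind of experiment I have in mind looks roughly like this (a minimal sketch; bfloat16 is supported natively by PyTorch, posit would need a custom backend, and `model` stands for an instance of the replica class shown below):

```
import torch

# cast all weights to bfloat16 and run inference in that format
model_bf16 = model.to(torch.bfloat16)
x = torch.rand(1, 3, 48, 48, dtype=torch.bfloat16)
with torch.no_grad():
    out = model_bf16(x)
```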

What I did so far:

- Converting the TF model to PyTorch

```
import os
import onnxmltools
from onnx2pytorch import ConvertModel

os.environ['TF_KERAS'] = '1'
# tf_load_model: my helper that loads the trained Keras model
pretrained_model = tf_load_model(config['model_list']['dna821_fp32'])
onnx_model = onnxmltools.convert_keras(pretrained_model)  # Keras -> ONNX
pretrained_model = ConvertModel(onnx_model, experimental=True)  # ONNX -> PyTorch
```
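
As a quick smoke test, the converted model runs on a dummy input (a sketch; the 48x48 RGB shape is an assumption based on my data, and the input is NHWC because the converted graph starts with a Transpose):

```
import torch

dummy = torch.rand(1, 48, 48, 3)
with torch.no_grad():
    out = pretrained_model(dummy)
print(out.shape)  # torch.Size([1, 43])
```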

This yields the following model, which has the same accuracy and size as the TensorFlow one:

```
ConvertModel(
  (Transpose_model/conv2d/BiasAdd__29:0): Transpose()
  (Conv_model/conv2d/BiasAdd:0): Conv2d(3, 8, kernel_size=(7, 7), stride=(1, 1))
  (Relu_model/conv2d/Relu:0): ReLU(inplace=True)
  (MaxPool_model/max_pooling2d/MaxPool:0): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
  (Conv_model/conv2d_1/BiasAdd:0): Conv2d(8, 150, kernel_size=(4, 4), stride=(1, 1))
  (Relu_model/conv2d_1/Relu:0): ReLU(inplace=True)
  (MaxPool_model/max_pooling2d_1/MaxPool:0): MaxPool2d(kernel_size=(6, 6), stride=(6, 6), padding=0, dilation=1, ceil_mode=False)
  (Transpose_model/max_pooling2d_1/MaxPool__43:0): Transpose()
  (Reshape_model/flatten/Reshape:0): Reshape(shape=[ -1 1350])
  (MatMul_model/dense/BiasAdd:0): Linear(in_features=1350, out_features=340, bias=True)
  (Relu_model/dense/Relu:0): ReLU(inplace=True)
  (MatMul_model/dense_1/BiasAdd:0): Linear(in_features=340, out_features=490, bias=True)
  (Relu_model/dense_1/Relu:0): ReLU(inplace=True)
  (MatMul_model/dense_2/BiasAdd:0): Linear(in_features=490, out_features=43, bias=True)
  (Softmax_dense_2): Softmax(dim=-1)
)
```

- Using this information, I built my model class in PyTorch:

```
import torch.nn as nn

class GTRSB(nn.Module):
    def __init__(self):
        super().__init__()
        self.seq = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=8, kernel_size=(7, 7), stride=(1, 1)),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False),
            nn.Conv2d(in_channels=8, out_channels=150, kernel_size=(4, 4), stride=(1, 1)),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=(6, 6), stride=(6, 6), padding=0, dilation=1, ceil_mode=False),
            nn.Flatten(),
            nn.Linear(in_features=1350, out_features=340, bias=True),
            nn.ReLU(inplace=True),
            nn.Linear(in_features=340, out_features=490, bias=True),
            nn.ReLU(inplace=True),
            nn.Linear(in_features=490, out_features=43, bias=True),
            nn.Softmax(dim=-1)
        )

    def forward(self, x):
        return self.seq(x)
```
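
The in_features=1350 of the first Linear layer matches the layer shapes, assuming 48x48 inputs:

```
import torch

# 48x48 input, no padding anywhere:
# Conv2d 7x7:   48 - 7 + 1 = 42  ->   8 x 42 x 42
# MaxPool 2x2:  42 / 2     = 21  ->   8 x 21 x 21
# Conv2d 4x4:   21 - 4 + 1 = 18  -> 150 x 18 x 18
# MaxPool 6x6:  18 / 6     = 3   -> 150 x  3 x  3
# Flatten:      150 * 3 * 3 = 1350
print(GTRSB()(torch.randn(1, 3, 48, 48)).shape)  # torch.Size([1, 43])
```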

- I initialized the GTRSB model and copied the parameters from the converted model into my model, which works without errors:

```
model = GTRSB()
model_state_dict = model.state_dict()
pretrained_state = pretrained_model.state_dict()

# the parameter names differ between the two models, but their order matches
pretrained_layer_names = [name for name, _ in pretrained_model.named_parameters()]
model_layer_names = [name for name, _ in model.named_parameters()]

# copy each tensor over by position
for src_name, dst_name in zip(pretrained_layer_names, model_layer_names):
    model_state_dict[dst_name] = pretrained_state[src_name]

model.load_state_dict(model_state_dict)
```

Printing the model now gives:

```
GTRSB(
  (seq): Sequential(
    (0): Conv2d(3, 8, kernel_size=(7, 7), stride=(1, 1))
    (1): ReLU(inplace=True)
    (2): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(8, 150, kernel_size=(4, 4), stride=(1, 1))
    (4): ReLU(inplace=True)
    (5): MaxPool2d(kernel_size=(6, 6), stride=(6, 6), padding=0, dilation=1, ceil_mode=False)
    (6): Flatten(start_dim=1, end_dim=-1)
    (7): Linear(in_features=1350, out_features=340, bias=True)
    (8): ReLU(inplace=True)
    (9): Linear(in_features=340, out_features=490, bias=True)
    (10): ReLU(inplace=True)
    (11): Linear(in_features=490, out_features=43, bias=True)
    (12): Softmax(dim=-1)
  )
)
```
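
To compare both networks directly, I run them on the same random input (a sketch; the converted model takes NHWC because of its leading Transpose layer, while mine takes NCHW):

```
import torch

x_nhwc = torch.rand(1, 48, 48, 3)
x_nchw = x_nhwc.permute(0, 3, 1, 2)
with torch.no_grad():
    diff = (pretrained_model(x_nhwc) - model(x_nchw)).abs().max()
print(diff)  # for an exact replica this should be ~1e-6, but here it is not
```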

**Problems**

My network (model) somehow cannot take inputs in the same shape as the converted one (pretrained_model): I have to change my input shape from (12630, 48, 48, 3) to (12630, 3, 48, 48) to run inference. Furthermore, the performance is not the same: my network has a much lower accuracy than the converted one.
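
Concretely, the reshaping I have to do before feeding my model looks like this (`x_test` is a placeholder for my test set as a NumPy array):

```
import torch

# NHWC -> NCHW; the converted model accepts (12630, 48, 48, 3) directly
x = torch.from_numpy(x_test).permute(0, 3, 1, 2)  # -> (12630, 3, 48, 48)
```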

I assume that I have some errors in my GTRSB class.

The converted network has a few more layers in its description than my network. For example, there is the

```
(Transpose_model/conv2d/BiasAdd__29:0): Transpose()
```

layer (or at least I think it is one) at the beginning of the summary. A similar Transpose layer shows up again in the middle of the summary, right before the Reshape. I think this could cause the behavior. Does anyone have an idea what the error could be, or how to fix my model class?

Thanks a lot