Hello!
I need to convert a model containing a recurrent module (GRU) from pytorch to caffe2.
As far as I can see, the only way of doing that is by using onnx.
My smallest example is the following:
import torch
import torch.onnx
import onnx
import caffe2.python.onnx.backend as backend
from caffe2.python.onnx.backend import Caffe2Backend


def main():
    # step 0: prepare a model which takes sequences of size 8 as input
    # and has 1 forward hidden layer of size 4.
    model_pytorch = torch.nn.GRU(input_size=8,
                                 hidden_size=4,
                                 num_layers=1)
    model_pytorch.eval()
    x = torch.randn(2, 1, 8)  # seq_len x batch x input_size
    h = torch.zeros(1, 1, 4)  # (num_layers * n_directions) x batch x hidden_size
    try:
        _ = model_pytorch(x, h)  # checking that model inference is OK
    except Exception as e:  # RuntimeError is already a subclass of Exception
        print(e)
        print(' ===== Unsuccessful model inference run, exiting ===== ')
        return
    finally:
        print(' ===== Step 0 finished ===== ')

    # step 1: convert to ONNX
    onnx_proto_output = "temp.onnx"
    try:
        torch.onnx.export(model_pytorch, (x, h), onnx_proto_output,
                          export_params=True, verbose=True)
    except Exception as e:
        print(e)
        print(' ===== Unsuccessful pytorch->ONNX run, exiting ===== ')
        return
    finally:
        print(' ===== Step 1 finished ===== ')

    # step 2: check ONNX model using Caffe2-ONNX backend
    model_onnx = onnx.load(onnx_proto_output)
    onnx.checker.check_model(model_onnx)  # returns None, raises if the model is invalid
    print(onnx.helper.printable_graph(model_onnx.graph))
    x_ = x.numpy()
    h_ = h.numpy()
    try:
        outputs = backend.run_model(model_onnx, (x_, h_))
        print(outputs)
    except Exception as e:
        print(e)
        print(' ===== Unsuccessful Caffe2.onnx run, exiting ===== ')
        return
    finally:
        print(' ===== Step 2 finished ===== ')

    # step 3: save model to caffe2 format
    try:
        # the first returned net must be bound to init_net, which is written below
        init_net, predict_net = Caffe2Backend.onnx_graph_to_caffe2_net(model_onnx)
        with open('init_net.pb', "wb") as f:
            f.write(init_net.SerializeToString())
        with open('predict_net.pb', "wb") as f:
            f.write(predict_net.SerializeToString())
    except Exception as e:
        print(e)
        print(' ===== Unsuccessful Caffe2 save run, exiting ===== ')
        return
    finally:
        print(' ===== Step 3 finished ===== ')


if __name__ == '__main__':
    main()
It fails on line 57 of my script, at the run_model(...) call of step 2:
ONNX FATAL: list index out of range
A bit of digging shows that the error comes from around line 425 of this file:
424     if x.name == W:
425         input_size = x.type.tensor_type.shape.dim[2].dim_value
426         break
It turns out that in this case the tensor being inspected has only 2 dims (12x8), so dim[2] does not exist:
name: "10"
type {
  tensor_type {
    elem_type: FLOAT
    shape {
      dim {
        dim_value: 12
      }
      dim {
        dim_value: 8
      }
    }
  }
}
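To make the shape mismatch concrete: the ONNX GRU operator documents its W input as [num_directions, 3*hidden_size, input_size], while torch.nn.GRU stores weight_ih_l0 as [3*hidden_size, input_size], which matches the 12x8 tensor dumped above (3 * 4 = 12, input_size = 8). The sketch below only illustrates why indexing dim[2] fails on a 2-D shape and how a lookup could tolerate both layouts. The helper gru_input_size is hypothetical and is not part of caffe2 or onnx, and this may or may not be the actual root cause here.

```python
# Hypothetical helper (not a caffe2/onnx API): read input_size from a GRU
# input-weight shape given as a plain list of ints.
def gru_input_size(dims):
    """Return input_size from a GRU input-weight shape.

    Accepts the 3-D ONNX layout [num_directions, 3*hidden_size, input_size]
    and the 2-D PyTorch layout [3*hidden_size, input_size].
    """
    if len(dims) not in (2, 3):
        raise ValueError("expected a 2-D or 3-D weight shape, got %r" % (dims,))
    # input_size is the last dimension in both layouts
    return dims[-1]


hidden_size, input_size = 4, 8
pytorch_shape = [3 * hidden_size, input_size]  # [12, 8], same as the dumped tensor
onnx_shape = [1] + pytorch_shape               # [1, 12, 8], one direction

try:
    pytorch_shape[2]  # what indexing dim[2] amounts to on a 2-D shape
except IndexError as e:
    print("2-D shape:", e)

print(gru_input_size(pytorch_shape), gru_input_size(onnx_shape))
```

Reading the last dimension instead of index 2 works for both layouts, but it is only a workaround sketch; whether the exporter or the backend should change is exactly the open question.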
Is there a version mismatch between the ways onnx and caffe2 handle such models?
If so, what is a good way of fixing that?
I use pytorch 0.4.0, onnx 1.2.1 (latest from source), caffe2 (latest from source).
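Since a version mismatch is one suspect, here is a minimal sketch for recording the installed versions alongside a report like this one. It uses only the standard library (importlib.metadata, Python 3.8+). The distribution names "torch" and "onnx" are assumptions, and a caffe2 built from source may not register a distribution at all, in which case the helper returns None.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names are assumptions; a source build may register differently.
for name in ("torch", "onnx"):
    print(name, installed_version(name))
```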
Thanks!