I’m trying to convert a simple PyTorch model (with conv and GRU layers) to ONNX and then load it into Caffe2. With the full trained model, both the conversion and the Caffe2 loading work fine. However, I want to drop the final layer of the model. When I do this and try to load it into Caffe2, I get the error “RuntimeError: Inferred shape and existing shape differ in rank: (0) vs (2)”.
The procedure I follow is:
- Train the full model
- Save it to a .pth file
- Create a new model object without the final layer
- Load the state_dict from the .pth file (with strict=False; otherwise PyTorch complains about the final layer being missing)
- Export it to ONNX
- Load the ONNX model into Caffe2
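The save, rebuild, and load steps above can be sketched roughly as follows. This is only a minimal illustration: `TinyModel`, its layer sizes, and the in-memory buffer (used in place of a .pth file) are assumptions, not the actual model from this post. It shows that `load_state_dict(..., strict=False)` returns the keys it skipped, which makes it possible to check that only the final layer's weights were left out.

```python
import io

import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for the real model; names and sizes are assumptions."""

    def __init__(self, with_output_layer=True):
        super().__init__()
        self.with_output_layer = with_output_layer
        self.gru1 = nn.GRU(input_size=8, hidden_size=16,
                           num_layers=2, bidirectional=True)
        if with_output_layer:
            self.out_layer = nn.Linear(32, 4)


# "Trained" full model (training itself is omitted in this sketch).
full = TinyModel(with_output_layer=True)

# Save the state_dict; a BytesIO buffer stands in for the .pth file.
buffer = io.BytesIO()
torch.save(full.state_dict(), buffer)
buffer.seek(0)

# New model object without the final layer, loaded non-strictly.
headless = TinyModel(with_output_layer=False)
result = headless.load_state_dict(torch.load(buffer), strict=False)

# Keys the new model expected but did not find, and keys it had no use for.
print("missing:", result.missing_keys)
print("unexpected:", result.unexpected_keys)
```

If `unexpected_keys` lists only the final layer's parameters and `missing_keys` is empty, the state dict loading is probably not the source of the problem.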
I use onnx.checker.check_model(model) to check that everything is OK, but so far it produces no output, i.e. it raises no error. The program then crashes when executing
prepared_backend = onnx_caffe2_backend.prepare(model)
where I get the error stated above.
What bothers me about this error is that I can’t tell which layer it is complaining about.
EDIT:
I tried creating the model without the final layer and exporting it right away, without training. Same problem. I thought it might be an issue with how the state dict is loaded, but it doesn’t look like it.
EDIT2:
Here’s the forward code:
x = self.embeddings(x)
x = x.permute(0, 3, 1, 2)
x = self.conv1(x).squeeze(3).permute(2, 0, 1)
outputs, hidden = self.gru1(x)
if self.with_output_layer:
    output = outputs.permute(1, 0, 2)[:, 25, :]
    return self.out_layer(output)
else:
    # sum last state from forward and backward direction
    return hidden[3, :, :] + hidden[2, :, :]
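For reference, the shapes in the `else` branch can be checked in isolation. The sketch below assumes a 2-layer bidirectional `nn.GRU`, which is only inferred from the indices 2 and 3 and the comment in the code; the sizes are arbitrary. It shows the shape of `hidden` and checks that indices 2 and 3 are the last layer's forward and backward final states.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Assumed sizes, chosen only for illustration.
seq_len, batch, features, hidden_size = 30, 5, 8, 16

# Assumed configuration: 2 layers, bidirectional (4 final states in total).
gru = nn.GRU(features, hidden_size, num_layers=2, bidirectional=True)

# Input laid out as (seq, batch, features), as after permute(2, 0, 1).
x = torch.randn(seq_len, batch, features)
outputs, hidden = gru(x)

# hidden:  (num_layers * num_directions, batch, hidden_size)
# outputs: (seq_len, batch, num_directions * hidden_size), last layer only
summed = hidden[3, :, :] + hidden[2, :, :]

# The last layer's forward state ends at the last time step,
# its backward state ends at the first time step.
forward_final = outputs[-1, :, :hidden_size]
backward_final = outputs[0, :, hidden_size:]

print(hidden.shape, summed.shape)
print(torch.allclose(hidden[2], forward_final),
      torch.allclose(hidden[3], backward_final))
```

Under these assumptions the returned tensor has rank 2, with shape (batch, hidden_size).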