Convert PyTorch model to ONNX format (inference not same)

nabsabs · July 14, 2022, 5:59pm

Hey @ptrblck, I’m having a similar issue. I’ve used the above code to export, with no success.

This is how you can reproduce it:

from transformers import AutoModel
import torch
import onnxruntime as ort


def to_numpy(tensor):
    return tensor.cpu().numpy()


model_name = "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext"

model = AutoModel.from_pretrained(model_name)
model.eval()


ids = torch.randint(low=0, high=30000, size=(1, 128), dtype=torch.long)
mask = torch.ones((1, 128), dtype=torch.long)

torch.onnx.export(
    model,
    (ids, mask),
    "onnx/test.onnx",
    opset_version=13,
    input_names=["ids", "mask"],
    output_names=["output"],
    export_params=True,
    dynamic_axes={
        "ids": {0: "batch_size"},
        "mask": {0: "batch_size"},
        "output": {0: "batch_size"},
    },
)

# get onnx_model outputs
onnx_model = ort.InferenceSession(
    "onnx/test.onnx", providers=["CPUExecutionProvider"]
)
onnx_input = {
    "ids": to_numpy(ids),
    "mask": to_numpy(mask),
}
onnx_x = onnx_model.run(None, onnx_input)  # [(1,128,768), (1,768)]

# get torch model outputs
x = model(ids, mask)  # [(1,128,768), (1,768)]

# check difference
delta = x[0].shape[0] - onnx_x[0][0]
print(
    delta.min(), delta.max(), delta.mean(), delta.std()
)  # difference in the tensors
assert x[0][0] == onnx_x[0][0]

The output of that script is -5.3224783 14.314435 1.0189513 0.5448166 (min, max, mean, std), which shows that the tensors are, on average, about 1.02 units apart. This impacts my downstream task. Do you know what could be the issue?
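
One thing worth checking in the script above: the `delta` line subtracts the ONNX output from `x[0].shape[0]`, which is the integer batch size (1), not from the PyTorch tensor itself. If that is what produced the numbers, a mean near 1 would be expected even when the two models agree. Also, `==` on multi-element arrays is elementwise, so it cannot be used directly in an `assert`; a tolerance-based check is the usual alternative. Below is a minimal sketch of such a comparison. The helper name `compare_outputs`, the tolerances, and the random stand-in data are assumptions for illustration, not model output; with real tensors, convert the PyTorch side first with `.detach().cpu().numpy()`.

```python
import numpy as np


def compare_outputs(torch_out, onnx_out, rtol=1e-3, atol=1e-5):
    """Summarize the elementwise gap between two same-shaped arrays.

    Hypothetical helper; the default tolerances are illustrative only.
    """
    a = np.asarray(torch_out, dtype=np.float64)
    b = np.asarray(onnx_out, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    abs_diff = np.abs(a - b)
    return {
        "max_abs_diff": float(abs_diff.max()),
        "mean_abs_diff": float(abs_diff.mean()),
        "allclose": bool(np.allclose(a, b, rtol=rtol, atol=atol)),
    }


# Stand-in data with the hidden-state shape from the post, (1, 128, 768).
# These are random values, not real PubMedBERT outputs.
rng = np.random.default_rng(0)
reference = rng.standard_normal((1, 128, 768)).astype(np.float32)
candidate = reference + np.float32(1e-6)  # simulate tiny numerical noise

summary = compare_outputs(reference, candidate)
print(summary)

# The expression from the post reduces to (1 - onnx_output) because shape[0] == 1,
# so its mean sits near 1 even though the two arrays almost coincide.
buggy_delta = reference.shape[0] - candidate
print(buggy_delta.mean())
```

With the real models, the call would look like `compare_outputs(x[0].detach().cpu().numpy(), onnx_x[0])`. Small float32 differences between PyTorch and ONNX Runtime are normal, so the tolerance should be chosen for the downstream task rather than set to zero.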

Versions:

transformers       4.20.1
torch              1.12.0
onnx               1.11.0
onnxruntime        1.11.1
onnxruntime-tools  1.7.0
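
For anyone reproducing this, a version list like the one above can be collected from the active environment with the standard library. This is a small sketch; the helper name `installed_versions` is hypothetical, and packages that are not installed are reported as such rather than raising.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or 'not installed'."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


# Distribution names taken from the list in the post.
packages = ["transformers", "torch", "onnx", "onnxruntime", "onnxruntime-tools"]
for name, version in installed_versions(packages).items():
    print(f"{name:<18} {version}")
```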