Mismatch when converting TransformerEncoder to ONNX

Mohamed_Hassan · September 18, 2023, 6:24pm

I am converting a model containing a TransformerEncoder to ONNX. However, I noticed that the outputs I get from PyTorch and from onnxruntime are different.

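import numpy as np
import onnxruntime as ort
import torch
import torch.nn as nn


def to_numpy(tensor):
    # Helper not shown in the original post; assumed to convert a tensor to a NumPy array.
    return tensor.detach().cpu().numpy()


class Model(nn.Module):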
    def __init__(self):
        super().__init__()
        trans_enc_layer = nn.TransformerEncoderLayer(d_model=32,
                                                     nhead=4,
                                                     dim_feedforward=64)

        self.seqTransEncoder = nn.TransformerEncoder(trans_enc_layer,
                                                     num_layers=1)

    def forward(self, x):
        return self.seqTransEncoder(x)


model = Model()
x = torch.randn(7, 1, 32)
y = model(x)

output_file_name = 'model.onnx'
torch.onnx.export(
    model,
    x,
    output_file_name,
    training=torch.onnx.TrainingMode.EVAL,
)
ort_session = ort.InferenceSession(output_file_name)
ort_inputs = {ort_session.get_inputs()[0].name: to_numpy(x)}
ort_outs = ort_session.run(None, ort_inputs)
np.testing.assert_allclose(to_numpy(y), ort_outs[0], rtol=1e-03, atol=1e-05)

When I run the code above, I get the following error:

Not equal to tolerance rtol=0.001, atol=1e-05

Mismatched elements: 224 / 224 (100%)
Max absolute difference: 0.96960294
Max relative difference: 14.14618

ptrblck · September 18, 2023, 8:52pm

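You are not calling model.eval() on the PyTorch model before creating the reference output, so the reference is computed in training mode with the dropout layers active (nn.TransformerEncoderLayer defaults to dropout=0.1), while the export itself uses TrainingMode.EVAL. Check whether calling model.eval() before computing y reduces the numerical mismatches.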
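
To illustrate the effect, here is a minimal sketch in PyTorch only (no ONNX export involved), reusing the layer sizes from the question. The variable names and the fixed seed are my own choices for the illustration. It runs the same input through the encoder twice in each mode and checks whether the two outputs agree:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # seed chosen only to make the illustration repeatable

layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, dim_feedforward=64)
encoder = nn.TransformerEncoder(layer, num_layers=1)
x = torch.randn(7, 1, 32)

# Training mode (the default after construction): dropout is active,
# so two forward passes on the same input give different outputs.
encoder.train()
with torch.no_grad():
    train_a = encoder(x)
    train_b = encoder(x)
train_mode_matches = torch.allclose(train_a, train_b)

# Eval mode: dropout is disabled, so the forward pass is repeatable.
encoder.eval()
with torch.no_grad():
    eval_a = encoder(x)
    eval_b = encoder(x)
eval_mode_matches = torch.allclose(eval_a, eval_b)

print("train mode outputs match:", train_mode_matches)
print("eval mode outputs match:", eval_mode_matches)
```

In the code from the question, the equivalent change would be to call model.eval() right after model = Model(), before y = model(x), so that the reference output and the exported graph both run without dropout.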

Mohamed_Hassan · September 18, 2023, 10:01pm

Thanks a lot. That was the issue indeed.