ONNX Model Gives Different Outputs in Python vs Javascript

I’m trying to deploy a small GAN model in the browser using ONNX Runtime Web (JavaScript). I’m running into an issue where the same model.onnx file gives different outputs depending on whether I run it with the Python runtime or the JavaScript runtime, and I’m looking for help figuring out why this is happening.

The model takes a random normal vector as input and returns an 8x8x4 LongTensor as the image. To test the runtime differences, I fed the same input vector to the model in both environments and compared the sum of the output array and the number of zero-valued elements.
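
For context, this is roughly how I produce the fixed test vector that gets pasted into both scripts (the seed value and the test_vector.json file are just illustrative, not part of the model code):

import json
import torch

# Generate one fixed 256-dim normal vector and dump it as JSON,
# so the exact same numbers can be pasted into both the Python and JS scripts.
torch.manual_seed(42)  # seed value is only an example
vec = torch.randn(256)
with open('test_vector.json', 'w') as f:
    json.dump(vec.tolist(), f)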

Model code:

class InferenceModel(nn.Module):
    def __init__(self, layers):
        super().__init__()
        self.layers = layers # layers are just a stack of linear layers
        
    def forward(self, input):
        input = input.unsqueeze(0)
        # run vector through layers, clamp to 0,1, multiply by 255 and convert to long
        output = (self.layers(input).clamp(0,1) * 255).type(torch.LongTensor)
        output = output.view((8,8,4)) # reshape as image
        return output

im = InferenceModel(layers)
im.eval() # eval called before onnx trace
im.cpu()

dummy_input = torch.randn(256)
torch.onnx.export(im, args=dummy_input, f='model.onnx', verbose=True, input_names=['input'], output_names=['output'])
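
As a sanity check on the export itself (not part of my original code, just a sketch), I would compare the eager PyTorch output against the ONNX Runtime output for the same input:

import numpy as np
import onnxruntime
import torch

# Run the same input through the eager model and the exported ONNX graph.
check_input = torch.randn(256)
with torch.no_grad():
    torch_out = im(check_input).numpy()

sess = onnxruntime.InferenceSession('model.onnx', None)
ort_out = sess.run(None, {'input': check_input.numpy()})[0]

# Should match; at most tiny discrepancies at the cast/rounding boundary.
print(np.array_equal(torch_out, ort_out))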

ONNX python inference:

import onnx, onnxruntime
import numpy as np

session = onnxruntime.InferenceSession('model.onnx', None)
output_name = session.get_outputs()[0].name
input_name = session.get_inputs()[0].name

# for testing, input array is explicitly defined
inp = np.array([ 1.9269153e+00,  1.4872841e+00, ...], dtype=np.float32)  # cast to float32 to match the model's float input

result = session.run([output_name], {input_name: inp})

result[0].sum() # use sum over entire array to show difference between this and JS runtime
> 8900

(result[0] == 0).sum() # check number of zero values
> 203
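
Since the model casts its output to a LongTensor, I also checked the dtype and shape of what the Python session returns (just a quick check, not shown in the results above):

result[0].dtype  # the model casts to torch.LongTensor, so the ONNX output is int64
> dtype('int64')

result[0].shape
> (8, 8, 4)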

Javascript Inference:

<!DOCTYPE html>
<html>
  <head>
  </head>
  <body>
    <script src="https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/ort.min.js"></script>
    <script src="script.js"></script>
  </body>
</html>

script.js:

async function run() {
    try {
      const session = await ort.InferenceSession.create('./model.onnx');
  
      const dims = [256];
      const size = dims[0];
      const inputData = new Float32Array([ 1.9269153e+00,  1.4872841e+00,  ...]); // exact same input array, explicitly defined
  
      const feeds = { input: new ort.Tensor('float32', inputData, dims) };
  
      const results = await session.run(feeds);
      const arr_sum = results.output.data.reduce((partialSum, a) => partialSum + a, 0); // sum over array
      console.log(arr_sum);

      var num_zeros = 0;
      for (var i=0; i < results.output.data.length; i++){
        if (results.output.data[i] == 0){
          num_zeros++
        }
      }

      console.log(num_zeros);

      return results
    } catch (e) {
      console.log(e);
    }
  }

run();

> 2982
> 239

When run in Python, the fixed input vector returns an array that sums to 8900 with 203 zero-valued elements. When run in JavaScript with the same input vector, the code returns an array that sums to 2982 with 239 zero-valued elements. I’m trying to figure out why this difference is occurring. Any help is appreciated.
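
In case it helps, here is a sketch of how I would diff the two outputs element-wise in Python, assuming the JavaScript output values have been saved as a plain JSON list of numbers to js_output.json (that file and the dump step are hypothetical, not part of the code above):

import json
import numpy as np

# py_out is result[0] from the Python session above; js_output.json would
# hold the values of results.output.data saved from the browser.
py_out = result[0].astype(np.int64).ravel()
with open('js_output.json') as f:
    js_out = np.array(json.load(f), dtype=np.int64)

diff = py_out - js_out
print('elements that differ:', int((diff != 0).sum()))
print('max abs difference:', int(np.abs(diff).max()))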