Different inference results when running on arm64 and x86_64


I get different inference results when running my C++ code on arm64 and x86_64. I built PyTorch from source as described here: https://pytorch.org/mobile/ios/#build-pytorch-ios-libraries-from-source

The results on x86_64 are actually correct and the results on arm64 are incorrect. I double-checked against the Python implementation.

I am completely in the dark here, so any hint is much appreciated!

Can you provide the code?


  • load the model:
    torch::jit::script::Module module = torch::jit::load(fileName);

  • process:
    std::vector<torch::jit::IValue> inputs;
    at::Tensor tensor = torch::from_blob(data.data(), {1, 1, 11, 144}, torch::kFloat);
    inputs.push_back(tensor);
    torch::autograd::AutoGradMode guard(false);
    torch::Tensor result = module.forward(inputs).toTensor();

I have written unit tests and integration tests that show that the results are equal to Python's. But once I run this on the iPhone, the results are different.

How are you creating the data and how large is the difference?

I am creating the data from a std::vector<float>. It contains 11 spectra, each of size 144.

The difference is quite big, roughly a factor of 2.

So arm64 and x86_64 will use different backends. It is quite possible that you have found a bug in the arm64 one, in particular if you use less common modules. (E.g. I had that with transposed convolutions a year ago on arm32, where a network would run fine on amd64 but the output was messed up on my phone.)
I know it is a lot of work, but if you want, the ideal reproducing case would be to narrow the network down to a single module where things go wrong and then provide the module, its parameters, and the inputs. This would also limit how much you need to tell us about your network. :wink:

Best regards