Sigmoid returns different results depending on type of CPU

I hope this is the right category for my question. I have a very strange problem and I can’t shake the feeling that it is something trivial. I have some code that post-processes the output of an object detection model. If I run it on my main machine (Ubuntu, i5 CPU) everything works as expected. If I run the same code, with the same model and the same input, on my Coral Dev Board, I get different results.

I narrowed it down to a sigmoid call. Is there any reason why the sigmoid function on a float32 tensor should return different results on the two different CPUs? And if so, how can I fix it so that the Coral Dev Board produces the same results?

It’s normal up to a certain point.
Each CPU computes the sigmoid with different hardware and with a different set of instructions.
What’s the error between the two?
If it’s smaller than 1e-5, it’s kinda expected.
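
One way to put a number on it, as a rough sketch: save the sigmoid output on one machine, load it on the other, and look at the largest absolute difference. The helper name `max_abs_diff` is made up for illustration, and the second tensor below is only a float64 stand-in for the output that would really come from the other machine.

```python
import torch

def max_abs_diff(a: torch.Tensor, b: torch.Tensor) -> float:
    """Largest element-wise absolute difference between two tensors."""
    return (a.double() - b.double()).abs().max().item()

# Stand-in data: on real hardware `out_other` would be loaded from a file
# written on the second machine (for example with torch.save / torch.load).
x = torch.linspace(-10, 10, steps=1001, dtype=torch.float32)
out_here = torch.sigmoid(x)
out_other = torch.sigmoid(x.double()).float()  # placeholder reference

diff = max_abs_diff(out_here, out_other)
print(f"max abs diff: {diff:.3e}")
print("within tolerance:", diff < 1e-5)
```

Anything well above 1e-5 would point to something other than ordinary floating-point differences.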

The problem is that the difference is a lot higher than that.

Found the problem but I’m not sure if it is a bug in PyTorch or my code. After some testing I found out that the sigmoid function returns “wrong” results on large tensors on my Coral Dev Board. If for example I do the following:

a = torch.ones(1000000)
b = torch.sigmoid(a)

I get a result tensor where the first half is ~ 0.7301 and the second half is ~ 0.43. Every input is 1, so every output should be the same value; a difference that large can’t be rounding error.
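
For anyone who wants to check their own board, here is a minimal sketch of a test for this. It compares every element against the mathematical value 1 / (1 + e^-1), which is about 0.7311, and reports where the output first goes wrong. The 1e-5 tolerance is an assumption taken from the reply above.

```python
import math
import torch

a = torch.ones(1_000_000)
b = torch.sigmoid(a)

expected = 1.0 / (1.0 + math.exp(-1.0))  # about 0.7311

# Every input is 1.0, so min and max of the output should coincide.
print("min:", b.min().item(), "max:", b.max().item())

# Indices where the result strays from the reference value.
bad = ((b - expected).abs() > 1e-5).nonzero().flatten()
if bad.numel() == 0:
    print("all elements match the reference")
else:
    print("first bad index:", bad[0].item(), "of", bad.numel(), "bad elements")
```

On a machine that behaves correctly this should report that all elements match; on an affected one, the first bad index shows where the output starts to diverge.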

My solution was to implement the sigmoid function in NumPy, but I would prefer to keep it in PyTorch if there is a solution for this.
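
A NumPy workaround along these lines might look like the sketch below. This is not the exact code from my project; the function name `sigmoid_numpy` and the numerically stable formulation are assumptions for illustration.

```python
import numpy as np
import torch

def sigmoid_numpy(t: torch.Tensor) -> torch.Tensor:
    """Sigmoid computed with NumPy, returned as a tensor of the input dtype."""
    x = t.detach().cpu().numpy().astype(np.float64)
    # Stable form: exp(-|x|) always lies in [0, 1], so it cannot overflow.
    e = np.exp(-np.abs(x))
    y = np.where(x >= 0, 1.0 / (1.0 + e), e / (1.0 + e))
    return torch.from_numpy(y).to(t.dtype)

out = sigmoid_numpy(torch.ones(1_000_000))
print(out.min().item(), out.max().item())
```

Note that `detach()` cuts the tensor out of the autograd graph, so this only suits post-processing, not anything that needs gradients.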