Hi!

I’ve noticed that when I feed my data in a different order at inference time, the results are slightly different. I’ve called `model.eval()`, and everything is fine as long as the order stays the same. I was seeing different results from the `flask` server I was using, so I tried shuffling the input with `shuf`, and that is when I noticed the problem. Is this expected?

Example:

```
cat data | wc -l # 300,000 lines
for i in $(seq 1 100); do
cat data | inference | md5sum # Always same MD5
done
cat data | shuf | inference | md5sum # Different MD5
comm -3 <(cat data | inference | sort) <(cat data | shuf | inference | sort) | wc -l # 6 differing lines -> 3 inputs with different results
```
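To quantify how "slight" the differences are, a numeric comparison may be more informative than `md5sum`, which flags any change in the last printed digit. The following is only a minimal sketch: it assumes that `inference` prints one `input<TAB>score` line per input, which is a hypothetical format, and the helper names `parse_scores` and `compare_runs` and the tolerance values are made up for illustration.

```python
import math


def parse_scores(lines):
    """Map each input to its float score, assuming 'input<TAB>score' lines (hypothetical format)."""
    scores = {}
    for line in lines:
        # Split on the last tab so that inputs containing tabs still parse.
        key, value = line.rstrip("\n").rsplit("\t", 1)
        scores[key] = float(value)
    return scores


def compare_runs(ordered_lines, shuffled_lines, rel_tol=1e-5, abs_tol=1e-8):
    """Return (input, score_a, score_b) triples whose scores differ beyond the tolerance."""
    ordered = parse_scores(ordered_lines)
    shuffled = parse_scores(shuffled_lines)
    mismatches = []
    for key, a in ordered.items():
        # Assumes every input appears in both runs and inputs are unique.
        b = shuffled[key]
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches.append((key, a, b))
    return mismatches


if __name__ == "__main__":
    # Toy data standing in for the two inference outputs.
    run_a = ["foo\t0.1234567\n", "bar\t0.7654321\n"]
    run_b = ["bar\t0.7654322\n", "foo\t0.1234567\n"]
    print(compare_runs(run_a, run_b))
```

Matching results by input rather than by position means the shuffled run does not need to be sorted first, and the tolerances can be tightened to see at which precision the two runs start to disagree.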

I’ve checked out similar posts like:

https://discuss.pytorch.org/t/slightly-different-results-when-evaluating-same-model-on-different-machines

https://discuss.pytorch.org/t/slightly-different-results-on-k-40-v-s-titan-x

but they don’t describe exactly the same situation.

GPU: GeForce RTX 2080 Ti

NVIDIA driver version: 525.85.12

CUDA: 12.0

Thank you!