Hi!
I’ve noticed that when I feed my data in a different order at inference time, the results are slightly different. I’ve called model.eval(), and everything is fine if the order stays the same. I first saw differing results in the Flask server I was using, so I tried shuffling the input with shuf, and that is when I noticed this problem. Is this expected?
Example:
cat data | wc -l # 300000 lines
for i in $(seq 1 100); do
cat data | inference | md5sum # Always same MD5
done
cat data | shuf | inference | md5sum # Different MD5
comm -3 <(cat data | inference | sort) <(cat data | shuf | inference | sort) | wc -l # 6 -> 3 different results
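A byte-level check like md5sum flags any difference, however small. To see whether the mismatches are only tiny numerical deviations, the outputs could instead be compared with a tolerance. Below is a minimal sketch in Python, assuming each output line parses to a single float (an assumption; the actual output format of inference isn't shown here). The function name count_mismatches, the tolerance values, and the sample scores are all hypothetical and only for illustration.

```python
import math


def count_mismatches(run_a, run_b, rel_tol=1e-5, abs_tol=1e-8):
    """Count value pairs that differ by more than the given tolerance.

    Both runs are sorted first, mirroring the `sort` step in the shell
    example, so the order in which the inputs were fed does not matter.
    """
    if len(run_a) != len(run_b):
        raise ValueError("runs must contain the same number of outputs")
    pairs = zip(sorted(run_a), sorted(run_b))
    return sum(
        not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, b in pairs
    )


# Made-up scores for illustration only, not real model output.
ordered = [0.1234567, 0.7654321, 0.5000000]
shuffled = [0.5000001, 0.1234567, 0.7654321]

# Tolerant comparison, then an exact one (both tolerances set to zero).
print(count_mismatches(ordered, shuffled))
print(count_mismatches(ordered, shuffled, 0.0, 0.0))
```

If the tolerant count is zero while the exact count is not, the differing lines are at least consistent with small floating-point deviations rather than different predictions. Pairing values by sorted position is only a rough stand-in for matching each output to its input.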
I’ve checked out similar posts like:
https://discuss.pytorch.org/t/slightly-different-results-when-evaluating-same-model-on-different-machines
https://discuss.pytorch.org/t/slightly-different-results-on-k-40-v-s-titan-x
but they don’t describe exactly the same situation.
GPU: GeForce RTX 2080 Ti
NVIDIA driver version: 525.85.12
CUDA: 12.0
Thank you!