My server crashed after running this code.

Also, I have the same problem as described in accuracy-difference-on-multi-gpu-with-nn-dataparallel.

For the same model, training on one GPU gives a result of 0.582, while training on 8 GPUs gives 0.616. The difference is too large… Both results are reproducible. I don’t know which one is “correct”.