Frequency of evaluation steps affecting PyTorch model accuracy

The small differences in accuracy might be caused by different pseudorandom number generator behavior: creating the DataLoader iterator (the BaseDataLoaderIter) consumes random numbers, so running the evaluation step more or less often changes the RNG state seen by the subsequent training steps, as seen in this post.
Could you check whether re-seeding directly before calling train_one_epoch yields the same performance?
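Something like this minimal sketch, where train_one_epoch, evaluate, and the loader/model objects are placeholders for your own training loop and the seed value is arbitrary:

```python
import random
import numpy as np
import torch

def reseed(seed: int) -> None:
    # Reset the Python, NumPy, and PyTorch (CPU + CUDA) RNGs to a known state
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

for epoch in range(num_epochs):
    # Re-seed right before training so any random numbers consumed earlier
    # (e.g. while creating the validation DataLoader iterator) no longer
    # influence this epoch's shuffling, dropout, etc.
    reseed(42 + epoch)
    train_one_epoch(model, optimizer, train_loader, epoch)
    evaluate(model, val_loader)
```

If the runs then match regardless of how often you evaluate, the RNG state (and not the evaluation itself) is the cause of the accuracy differences.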