Simple neural network code for tic-tac-toe

The inference code repeatedly gave wrong results, so it seems the model produced by the training code is wrong. Any idea why?