In case it's helpful, I have a collection of implementations in Jupyter Notebooks where most of the multi-layer perceptrons and convnets are trained on MNIST. For most of them, there’s both a TensorFlow and a PyTorch implementation if you’d like to compare the two: https://github.com/rasbt/deep-learning-book/tree/master/code/model_zoo/
I haven’t carefully tuned any of the networks, but they seem to perform decently on MNIST: e.g., ~98% test accuracy with the multi-layer perceptron with batch norm and ~99% with a super simple ResNet.
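
For a rough idea of what a multi-layer perceptron with batch norm can look like in PyTorch, here is a minimal sketch. It is not copied from the notebooks: the class name `BatchNormMLP`, the hidden size of 128, and the random tensors standing in for an MNIST batch are all assumptions made for illustration, and it only runs one forward/backward pass rather than a full training loop.

```python
import torch
import torch.nn as nn


class BatchNormMLP(nn.Module):
    """Hypothetical two-hidden-layer MLP with batch norm for 28x28 inputs."""

    def __init__(self, num_features=784, num_hidden=128, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),  # (N, 1, 28, 28) -> (N, 784)
            # The bias is dropped because BatchNorm1d adds its own shift term.
            nn.Linear(num_features, num_hidden, bias=False),
            nn.BatchNorm1d(num_hidden),
            nn.ReLU(),
            nn.Linear(num_hidden, num_hidden, bias=False),
            nn.BatchNorm1d(num_hidden),
            nn.ReLU(),
            nn.Linear(num_hidden, num_classes),
        )

    def forward(self, x):
        return self.net(x)  # raw logits; softmax is applied inside the loss


torch.manual_seed(0)
model = BatchNormMLP()

# Random stand-ins for one MNIST mini-batch (assumed shapes, not real data).
images = torch.randn(32, 1, 28, 28)
labels = torch.randint(0, 10, (32,))

logits = model(images)
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()  # gradients are now available for an optimizer step
```

To actually train it, you would swap the random tensors for a real MNIST data loader and wrap the last three lines in a loop with an optimizer. Remember to call `model.eval()` before measuring test accuracy so that batch norm uses its running statistics.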