ReLU in MNIST example

I noticed that there is a ReLU here in an MNIST example. It turns a usual softmax into a softmax with non-negative input. I wonder if I’m missing something, or if it is a reasonable thing to do.

I think it is not an unreasonable thing to do, but it probably makes sense to remove it, since removing it might speed up learning a bit.
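To make the effect concrete, here is a small numeric sketch (my own illustration, not the original MNIST code) of what "softmax with non-negative input" means: ReLU clamps every negative logit to zero, so all classes whose logits were negative end up tied at the same probability, and the model loses the ability to rank them.

```python
import numpy as np

def softmax(z):
    # Standard numerically stable softmax.
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([2.0, -1.0, -3.0])

p_plain = softmax(logits)                    # negative logits keep their ordering
p_relu = softmax(np.maximum(logits, 0.0))    # ReLU first: [-1, -3] both become 0

print(p_plain)  # class 1 is more likely than class 2
print(p_relu)   # classes 1 and 2 are now exactly tied
```

So the composition is still a valid probability distribution, but the clamped logits discard information, which is consistent with the suggestion above that dropping the ReLU could help learning a little.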