gradcheck can be sensitive to numerical precision. You could, for example, run it on .double() inputs to alleviate that somewhat (I think this is what the test suite does, too).
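As a rough sketch of what running the check in double precision could look like (the pooling layer, input shape, and tolerances here are my own assumptions, not taken from your code):

```python
import torch
from torch.autograd import gradcheck

# Hypothetical layer and input; substitute your own module and shapes.
pool = torch.nn.MaxPool2d(kernel_size=2)

# Distinct integer values cast to float64, so no two entries in a pooling
# window are closer than 1.0 and the perturbation never crosses a tie.
x = torch.randperm(64).double().reshape(1, 1, 8, 8).requires_grad_()

# eps and atol are illustrative choices for double precision.
ok = gradcheck(pool, (x,), eps=1e-6, atol=1e-4)
print(ok)
```

The input is deliberately built from well-separated values; with random data in a narrow range the check can still fail even in double precision, for the reason described below.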
In the second part, you may be right at the limit of the required precision, so whether the check passes or not depends on the random input.
The problem is that maxpool is not differentiable everywhere. It has a kink at many points (every time the input element that attains the maximum changes).
That means that if your test case hits one of these points (where two inputs in the same window are very, very close), the gradient computed by finite differences will differ from the subgradient given by the backward pass.
Given your eps of 1e-3, the finite-difference gradient will be wrong whenever two values in the same patch differ by less than 1e-3. Since your input is quite large and you generate the data in a very small range, I guess this happens all the time.
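To make the mismatch concrete, here is a minimal pure-Python sketch of a single pooling window. It assumes a central finite difference; the helper names and the sample values are made up for illustration.

```python
def window_max(values):
    """Max pooling over a single window."""
    return max(values)

def central_diff(values, i, eps):
    """Finite-difference estimate of d max / d values[i]."""
    plus = list(values)
    minus = list(values)
    plus[i] += eps
    minus[i] -= eps
    return (window_max(plus) - window_max(minus)) / (2 * eps)

def subgradient(values, i):
    """What the backward pass returns: 1 for the argmax, 0 otherwise."""
    return 1.0 if i == values.index(max(values)) else 0.0

eps = 1e-3

# Two values closer than eps: perturbing index 0 changes which one is the max.
close = [0.5000, 0.5004]
fd_close = central_diff(close, 0, eps)
analytic_close = subgradient(close, 0)

# Well-separated values: the perturbation never changes the argmax.
far = [0.5, 0.9]
fd_far = central_diff(far, 0, eps)
analytic_far = subgradient(far, 0)

print(fd_close, analytic_close)
print(fd_far, analytic_far)
```

For the close pair the finite difference comes out at roughly 0.3 while the subgradient is 0, so the check fails. For the separated pair both are 0 and they agree.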