Output of a piecewise discrete function as a classification problem

I have synthetic data generated by a piecewise-constant function that assigns a scalar feature to one of a fixed number of bins:

def bin(x, min_val, bin_width, num_classes):
    # Floor-divide to get the bin index, then clamp it to [0, num_classes - 1].
    return min(num_classes - 1, max(0, (x - min_val) // bin_width))
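To make the labelling concrete, here is a minimal sketch of how such a synthetic dataset could be generated. The names `bin_index` and `make_dataset`, the default parameter values, and the sampling range are assumptions for illustration and are not taken from the original setup. `math.floor` is used so the labels come out as Python integers even for float inputs.

```python
import math
import random

def bin_index(x, min_val, bin_width, num_classes):
    """Return the integer bin of x, clamped to [0, num_classes - 1]."""
    raw = math.floor((x - min_val) / bin_width)
    return min(num_classes - 1, max(0, raw))

def make_dataset(n, min_val=0.0, bin_width=1.0, num_classes=10, seed=0):
    """Sample n features slightly beyond the binned range and label them."""
    rng = random.Random(seed)
    lo = min_val - bin_width                       # one bin width below the range
    hi = min_val + (num_classes + 1) * bin_width   # one bin width above the range
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    ys = [bin_index(x, min_val, bin_width, num_classes) for x in xs]
    return xs, ys

# Hypothetical usage: 1000 samples, 10 classes of width 1.0 starting at 0.0.
xs, ys = make_dataset(1000)
```

Sampling a little outside the binned range exercises the clamping at both ends, so the first and last classes also receive out-of-range inputs.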

I am setting this up as a classification problem with an integer target in the range 0 to num_classes - 1. I expected a two-layer fully connected network to predict the class (that is, the bin number) accurately, but the results are disappointing.
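For reference, one possible reading of this setup is sketched below as a plain-Python forward pass. The post does not show the actual model, so the hidden size of 32, the ReLU activation, the softmax output, the cross-entropy loss, and every helper name here are assumptions. The network is untrained; the sketch only shows the shape of the problem (one scalar in, num_classes probabilities out).

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Random weights (n_out x n_in) and zero biases for one dense layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(weights, biases, inputs):
    """Affine map: one output per weight row."""
    return [sum(w * v for w, v in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(params, x):
    """Scalar feature -> ReLU hidden layer -> class probabilities."""
    (w1, b1), (w2, b2) = params
    hidden = [max(0.0, h) for h in dense(w1, b1, [x])]
    return softmax(dense(w2, b2, hidden))

def cross_entropy(probs, target):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[target])

rng = random.Random(0)
num_classes, hidden_units = 10, 32  # assumed sizes, not from the post
params = [init_layer(1, hidden_units, rng),
          init_layer(hidden_units, num_classes, rng)]

probs = forward(params, 3.7)     # feature 3.7 would fall in bin 3
loss = cross_entropy(probs, 3)   # loss against the true bin index
```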

Any ideas on what I am doing wrong here?